Dispelling Myths about Dictation / Speech Recognition

When I first started writing this I had just finished listening to episode 156 of The Bestseller Experiment; as a patron supporter I get early access to episodes, as well as being a member of the wonderful BXP Team. The marvellous episode focused on an interview with the author Julian Barr about his new book The Way Home. Julian is also a long-time listener and member of the BXP Team. I highly recommend Julian’s book, a gripping, well-paced tale whose characters have clear connections and motivations. His book has also now earned an Amazon bestseller tag! I’m very much looking forward to the next book in the series.

Important paranoid associated thought: like many writers I feel like a fraud who just needs to write more, and thus I feel awkward about asking for advice; after all, I’ve already answered my own request for advice: “Write more!” Anyway, later in the episode the two Marks discussed writing using Speech Recognition (SR) and gave a call to action regarding listeners’ experiences with writing via dictation. I was surprised to find that I felt empowered and not a fraud, since this is a topic I know quite well.

As someone with long-term chronic Repetitive Strain Injury (RSI) in both of my wrists I have a lot of experience with speech recognition, going back nearly twenty years to the horrendous days of massively inaccurate software; the frustration and stress of trying to use the software often made me feel even worse! Fortunately the various programs have improved so dramatically in the last ten years that I now find dictating to be much faster, easier and shockingly more efficient. The vast improvements have come about because of the following factors:

  1. Understanding of what is involved in analysing language (technical).
  2. Improved code efficiency (technical).
  3. Substantially increased computer processing power (brute force).

This also means that modern speech recognition is better at recognising accent and voice differences. With training, software should adapt to work near-perfectly for most users; I appreciate that is quite a bold claim.

As someone who used to be able to maintain a decent enough typing speed of between 70 and 80 words per minute (WPM), having that ability taken away from me was devastating; I was unable to work or partake in most of my hobbies. Having struggled through the horrid early years of dictation I can appreciate why people are loath to give speech recognition a try. However, just about every problem has gone away these days.

In general many people are not up to date with the latest information when it comes to cutting-edge technology; after all there is so much to do and learn. This is in part because the various non-specialist media outlets are often years behind when reporting non-sensational things; there is so much to talk about, and typically they repeat the same core points. In this ever-accelerating technological era I suspect anyone who has not used modern speech recognition has heard opinions that are about software from 10+ years ago. My title was not an attempt at clickbait. When I discuss or read things about speech recognition there is an understandable fixation on accuracy, but with modern software claiming accuracy of 90%+ for most people with little to no training, and 95%+ with some training, I wonder why accuracy is still considered a barrier to entry. My system seems to be about 99% accurate, but I appreciate it has been used a lot over many years. My point is that most people make errors when typing anyway; even with grammar and spell checkers, mistakes slip through. Even for those who manage a rare 100% accuracy the first time they type something, the result should still be double-checked. Mistakes are still made and accuracy is a concern whether typing or speaking, so why not do the vast majority of the work via speech?

When I was working in adult social services I had a severe RSI flare-up, in fact my worst ever, which caused a domino effect of problems. When I returned to work, for a while I was able to cope thanks to speech recognition, despite being in a large busy office. I was surprised at how accurate it was even with all the background conversations. Additionally, instead of using a mouse to navigate the screen, I found that using voice commands had finally become efficient. How things had changed!

During long bouts of sleep deprivation I can somewhat rest my eyes whilst dictating. Thankfully I rarely get headaches, but dictation has also proved helpful when I have; I find it’s better to do something than nothing, since I’ll be suffering either way.

I’d like to highlight that a hybrid approach can be used. If you can still type and you want to, then do so. Dictation can be quite easy with today’s smartphones, so maybe you can use speech recognition whilst away from your normal work area. For the following reasons I’d recommend at least experimenting.

Speech Recognition Pros & Cons

Pro 1: Health

When dictating we don’t need to be sat down or stood still; we are not tied to a keyboard. Since we can move about, I often do so. Over the years I have done all manner of things whilst dictating, from physiotherapy and light exercise/stretching to things like cleaning or ironing. When I am having a particularly painful wrist episode my arms, shoulder and back all become problematic, resulting in difficulties sitting or standing for any length of time, so on particularly bad days I’ve even dictated whilst resting in bed.

Con 1: Training Time Investment

Like any new skill there can be a learning curve, which can vary dramatically from person to person. That said, these days dictation can start out at 90%+ accuracy on a modern device and software, even without any training.

I appreciate that getting out of comfort zones and allocating time to learn something can be challenging. Saying embrace the challenge is all well and good, but people and their situations can vary wildly. It is sensible to decide during an epically busy time that doing something new is too much of a risk, but because life is strange maybe the change will quickly be beneficial, even with regard to time, which links to Pro 2 …

Pro 2: Speed

Personally, I think the health reason is reason enough, but just in case, here is another. Just because a person is good at typing does not mean they should stick with that method, since dictating can allow them to be faster. I often find it easy to dictate over 100 WPM, sometimes as high as 150 WPM; granted, a few typists with specialist keyboards can beat that, but for the vast majority of people dictation is twice as fast as typing.

Following on from Con 1, it is worth learning the extra functions like how to navigate via dictation, as well as the various advanced commands. Going from quick dictation to struggling to carry out navigation commands can make you feel like a writing session was ruined; writers typically have enough reasons to procrastinate without imagining new ones 😉

Speed is a major factor for writing events like #NaNoWriMo, thus the speed advantage of dictation can really pay off.

Con 2: Initial Costs

Not everyone has a computer (desktop/laptop/tablet) or smartphone (I’m only differentiating because so many people typically do, as it is really just a computer with a phone function). Free speech recognition exists, but I find Dragon NaturallySpeaking to be better overall, although it isn’t cheap.

Then there is the topic of what microphone to use. Whilst you can use a laptop’s built-in microphone it is better to have a decent microphone, although I’ve found that a £25 microphone works just as well as my more expensive Yeti, so you don’t have to buy crazy equipment.

Other extras: I’ve also invested in a microphone stand, pop-filter, USB cable extension and a high quality wireless headset. The extension and the wireless headset are the reason I can exercise or tidy my room whilst dictating.

One of the problems I found using my fantastic quality Yeti microphone was there were a few delays/problems with the software, but this was because I had leaned back in my chair and thus wasn’t close enough to the microphone. So before you rush off to buy an expensive microphone consider how your setup can be altered to get improvements.

Pro 3: Speaking is Natural + Rhythm of Speaking

Based on this subtitle you can see why Nuance called their software NaturallySpeaking 😉 Particularly when dictating dialogue I find I can write a better scene; I think this is down to being able to somewhat act the scene out, as I feel more in character when I switch back and forth between character perspectives. I’ve even experimented with literally acting a scene out, although that led to some comedy moments of frantically changing my position to be the correct character, like a stand-up performance.

Sometimes we can spend a lot of time thinking about a subject only to find that when we speak we change what we had intended to say. There is something about speaking out loud; maybe it is because we engage more of our body, thus more of our brain. I also think this is probably a knock-on effect of evolution: as such a social species, we need to be careful of what we say to others.

One of the best tips for writers is: “Read your writing out loud.” Dictating can be a big help here: you get used to speaking out loud, so when it comes time to edit your work you are more likely to give it a try. This also links to one of the key tips from the Bestseller Experiment: “Make a public declaration.”

There is another advantage to dictating. If you think of a sentence and then struggle to dictate it, that is a sign there is a problem. Typically you’ll easily find a rhythm, indicating where commas and full stops best fit; granted you have to say “comma”, but I think that is no different to having to press the comma key. Maybe somebody who struggles with grammar could benefit from dictation?

Con 3: Editing

As I mentioned above, I think this is a con that gets too much attention, since work should be double-checked anyway. Still, it can be particularly irksome during the training period, when correcting (editing) as you go is highly recommended. One valid point in dictation’s favour is that recognition errors are typically ones we are aware of, unlike typing errors, which often slip through unnoticed.

Crucially this is a problem that fades over time; I rarely need to correct things now. Since I write fantasy fiction and role-playing games I also have lots of additions for my fantasy proper nouns, and my system mostly recognises these new words after the initial correction or two. Just like with typing, it is more important to get something written first; then you have something to edit.

Pro 4: Flow

Due to the pain from my disability, I lost my ability to enter a flow state whilst writing/typing. It was 2009 when this feeling briefly appeared during dictation. My comfort level with dictating had slowly grown over the years; by 2009 I found talking to my computer to be not only comfortable but also empowering.

Con 4: Habits

When first learning to use speech recognition a user can feel they are wasting their time. Why bother stressing yourself out, fighting your habits? I’ve separated this point from Con 1: Training, because I think habits/traditions are such a powerful part of our psychology.

Habits are typically difficult to break; various people can react differently to the same thing. Decades ago I had the regular association of being denied the use of my wrists to type a decent work session, the threat of pain from typing as well as sitting too long, plus stress and sleep deprivation. Since back then speech recognition was lacking, I quickly developed justifications about putting things off. In the light of pain-paranoia and frustration it became easy to justify thoughts like “I need to minimise computer usage even using dictation, so I need to work out as much as possible upfront.” Once I developed this habit I found it hard to break it, even as the ability of speech recognition improved.

Pro 5: Focus

I find I do not get distracted as much when I am dictating, maybe because I am typically away from my desk, so I cannot easily check emails or browse. It can seem like our hands have a mind of their own when, within a split second of thinking about a website, we’ve switched to it. This is why so many writers use blocking software that restricts their access to the Internet. Following on from Pro 3, I find that if I do start giving my computer commands to browse non-important things I quickly stop myself.

Con 5: Stream of Consciousness

Dictating does not dictate quality. The fact we can dictate more WPM means we can also have more to edit. This is a minor Con, and yes I’m being nit-picky, but over the years I have dictated a lot of garbage. I think I have solved this by writing more, showing others my work and learning more about writing; not just practice, but learning to carry out skilled practice. If you feel that when you start dictating you are writing garbage, don’t worry; I think you’ll quickly adapt.

Bonus Pro: Moving is Thinking

Linking back to Pro 3: Speaking is Natural, there is something about moving and thinking; dictation means you don’t have to be sat still at a keyboard. When we move we are activating different brain regions, plus getting the blood flowing, etc. Physical intelligence is one of the many types of intelligence being researched, and whilst kinaesthetic learners are typically separated from other learning types, the majority of people can learn in all manner of ways, including kinaesthetically. Quick interesting point: animals have a more developed brain than plants because they need to navigate; the sea squirt is a fascinating creature that, once it finds a permanent spot for its next stage of life, eats its own brain. It is also worth looking into the tools of memory specialists and how they utilise virtual spaces to associate memories for better recall.

Some speech recognition software allows for the transcribing of previously recorded speech. You can even transcribe a recording of another person, although I’ve never done this and I am not sure of the efficiency of the process.

I’ll be making a video version of the blog in the New Year, but before I finish here are some extra points. Dictating role-playing mechanics is not a big deal, and I even used speech recognition to dictate computer code years ago; I am contemplating giving it another go with the vastly improved software and machine power of today.

Whether walking outside or in bed trying to sleep (chronic pain is hell), I’ve dictated notes via my smartphone’s built-in software. Granted it is not as powerful as Dragon, but it is easy to do and I don’t have to get out of bed. I’ve also made use of a Dictaphone with a headset whilst walking, then later dictated the recording at home; this counted as a first draft. Dragon Anywhere allows for dictating on the go, but I cannot afford it, I am rarely out, and I already have Dragon 15.

In conclusion, if you are still not sure whether speech recognition is for you, I highly recommend giving it a go; at least go hybrid and mix things up. The future is already happening!

Links

I’ve written about The Bestseller Experiment before.

The Bestseller Experiment Podcast

Julian Barr

NaNoWriMo

GollanczFest 2017 part 2

This continues on from my Bestseller and GollanczFest post.

Day 1 of GollanczFest 2017 was on Saturday November 4th. Richie and I had plenty of time to chat about writing on the train as we travelled into London. Since Richie was going to the Writer’s Workshop at the Phoenix, whilst I was going to Foyles for the Panels, we knew we’d have plenty to discuss on the return train home.

I had heard Foyles was quite an impressive bookshop, and I can confirm it certainly is, which made it a great location for the event. Since I had over an hour till the panels started I perused the many floors of extensive bookshelves. Eventually I purchased John Yorke’s Into the Woods, a book both Richie and Mark Stay highly recommend, and settled down to read for a while. By the time I went back to the top floor for the event’s start there was quite an impressive queue.

Foyles

Upon entering the room for the panels, we were given a tote bag. The tote bag’s print design is quite impressive. The bag contained an overview of the event, some striking samples of forthcoming fiction, plus some water and even a chocolate bar. It certainly set the event up well, especially the water, since I’d forgotten to bring any; sadly I had also forgotten to bring Moo & Bat with me for photos.

GollanczFest tote bag

I got to talk briefly with Mark Stay, who was working on the event, both on and off stage. The Bestseller Experiment podcast had officially started at GollanczFest 2016, and it was great to see Mr Stay at this year’s event, with the bonus that he is now a bestselling author; shame Mr Desvaux couldn’t be there, but it’s a very long way for him to come for a weekend.

Mark Stay guardian of the Author-Portal for #GollanczFest 2017

One thing I regret is not talking to any of the people I was sat near. Undoubtedly anyone who had turned up was passionate about reading, and likely writing. Whilst it is likely the room had many shy book readers, it was unlikely any of us was going to freak out, quietly of course, and run out of the room if one of us said hello. Whilst I don’t consider myself shy anymore, I still fell back into very old habits of feeling awkward about striking up a conversation with the few people I’d made eye contact with, which is extra odd since I had been talking to Mark Stay only minutes earlier. Looking around the room, there were people chatting. Thankfully it was only ten minutes until the first panel started, and armed with my mobile I did a little bit of writing and then checked Twitter.

Since so much happened at the numerous panels, I’ve decided to do separate posts about each of them. I still have lots of fiction writing to do today so I’ll write about the first panel another time: Who you gonna call? Ghostwriters!

I’ve currently focused on the Gingerbread competition, since the deadline of December 4th is fast approaching. I’ve made it my NaNoWriMo writing challenge, and although attending the GollanczFest really made me want to return to my main writing project, I have managed to stay on target; well, not my word count target, but something is better than nothing.

PS – Richie is the person that runs http://www.richiedigital.co.uk/

Bestseller and GollanczFest

Last week, November 4th & 5th, was quite the experience as I went to the Gollancz Festival. The event was held at Foyles, a rather grand bookshop in London. I had wanted to go to the writer’s workshop, but that had sold out. Fortunately for me I won tickets to the main Gollancz Festival via the Bestseller Experiment podcast.

Since my friend Richie was going to the writer’s workshop the event also had an extra appeal; we already chat a lot about our writing, and it’s rare we meet up these days. Plus I had not visited his home yet, so after a brief discussion an extended visit was planned.

One good thing about long train journeys is that there is plenty of chance for reading and writing. Even though I suffer from travel sickness, trains are generally tolerable for me, and when I did feel a bit off I stopped writing and changed to listening to an audio book.

Joining me on my journey were Moo & Bat, my mini-fluffy-sidekicks. I planned on taking some silly pictures of them on the train and at the festival, in part because I’ve been thinking through some children’s story ideas. Plus the Adventures of Moo & Bat amuses my wife.

The Adventures of Moo & Bat

I’ll write about the Saturday morning Gollancz Festival panels next time. I’ll end this short post by highlighting that the Bestseller Experiment has a Patreon fund. Considering the value Mark Stay & Mark Desvaux have provided with this great podcast, it is something I am happy to support even though I currently have no income due to health problems. Just to clarify, I had backed them before I knew I had won Gollancz Festival tickets 😉 It would be quite sad if the podcast does not continue, and it’s worth considering what quality & quantity season 2 could provide, so please consider getting involved.

https://www.patreon.com/bestsellerexperiment

I quite enjoyed Mark Stay’s recent interview with Cover to Cover.

Part 2 of my GollanczFest visit.

The Bestseller Reality

This post continues on from my last blog about the Bestseller Experiment’s launch of their book Back to Reality. The book successfully became a bestseller, gaining the special orange tag on Amazon!

During the livestreams on launch day there was discussion about what is next for the two Marks. One of the authors I follow is Gavin G Smith, who introduced me to the podcast, and he suggested making the future of the podcast about winning a Man Booker Prize for Fiction! Given the success of both the podcast and the book, it does not sound ridiculous to suggest such a goal.

I found the book launch on Monday to be quite captivating; I joined in two livestreams and watched the other two after they were broadcast. I admit that a few months ago I was not certain that the experiment would be a success. With the final week’s daily podcasts, plus the frenetic activity by the beta readers (Experimentalists), the momentum for the launch convinced me that they’d succeed. It was great to see Neil Gaiman tweeting about the book, followed by this funny moment:

So many great writers were involved in the podcast, and I plan on going through the summary in their epic PDF Vault of Gold again soon. Given the typically busy lives of writers, and the many demands for their attention, it is understandable that some of my favourites, like Mr Gaiman, were not able to join in; another reason to keep the podcast going, even more epic guests!

As normal this week chronic pain has interfered with my sleep, but at least my mood is high, in part due to channelling the success of the two Marks into my own writing this week. I am working on the Gingerbread competition that was announced on the Bestseller podcast; I am currently writing two different stories, since I really wanted to write both. My wife and I have nearly finished our caterpillar cake:

Back To Reality Bat

You can keep up to date with what happens next with the Bestseller Experiment at their website.

The Bestseller Experiment Mark Stay Mark Oliver (Desvaux)

 

Back to Reality

Today is the launch of the book by the Bestseller Experiment, a wonderful podcast that I have been following, and have written about before. The hard work that Mark Stay and Mark Oliver (Desvaux) did on the podcast has really paid off with regard to the book. The great banter in the podcast was a big hint for what they were writing, and I think the Marks delivered their signature style well.

The good news is that the book has already reached number 1 in music:

Back To Reality no 1

It will be interesting to see how things develop over the next few weeks, as well as what will happen with the brilliant podcast; in the first live stream for the book launch, Mark Stay explained there will be a podcast break.

As a member of the beta team (Experimentalists) I got to read the book before its launch. The launch team was amazing, and being part of it was a very interesting experience. Sadly I was not as quick as my fellow Experimentalists in giving feedback. The team is another example of the many things a professional author should consider.

Back To Reality Bat

My review is:

This feel-good story starts out following the unhappy life of Jo; her dull life degenerates further, but then Yohanna turns up and things get weird! It’s certainly a page turner, as Jo deals with a comedic level of problems. Despite the scale of the protagonist’s problems, somehow the two Marks manage to maintain a light tone and some great humour.
Be warned: this funny story also has character depth and interesting scenes, all wrapped up with some heartfelt profundity. This good mix of fun and seriousness helped with the book’s pacing; I was intrigued to know what would happen next. In the spirit of this book’s good use of nostalgia, the story has a Play Your Cards Right feel: “Is the next scene’s emotion higher or lower than a 7?”
I recommend this fun read, particularly to anyone wanting a good balance of joy and sadness with a good sprinkling of gravitas about life choices. I rate this book a caterpillar cake, definitely worth consumption.

I am greatly looking forward to what else the two Marks do. In the meantime, tracking the book’s position in various charts will be fun. You can get your copy of the book now via the smart link at http://bestsellerexperiment.com/backtoreality/