Dispelling Myths about Dictation / Speech Recognition

When I first started writing this I had just finished listening to episode 156 of The Bestseller Experiment; as a patron supporter I get early access to episodes, as well as being a member of the wonderful BXP Team. The marvellous episode focused on interviewing the author Julian Barr about his new book The Way Home. Julian is also a long-time listener and member of the BXP Team. I highly recommend Julian’s book, a gripping, well-paced tale whose characters have clear connections and motivations. His book has also now earned an Amazon bestseller tag! I’m very much looking forward to the next book in the series.

Important paranoid associated thought: like many writers I feel like a fraud who just needs to write more, and thus I feel awkward about asking for advice; after all, I’ve already answered my own request for advice: “Write more!” Anyway, later in the episode the two Marks discussed writing using Speech Recognition (SR) and gave a call-to-action regarding listeners’ experiences with writing via dictation. I was surprised to find that I felt empowered and not a fraud, since this is a topic I know quite well.

As someone with long-term chronic Repetitive Strain Injury (RSI) in both of my wrists I have a lot of experience with speech recognition, going back nearly twenty years to the horrendous days of massively inaccurate software; the frustration and stress of trying to use the software often made me feel even worse! Fortunately the various programs have improved so much in the last ten years that I find dictating to be dramatically faster, easier and shockingly more efficient. The vast improvements have come about because of the following factors:

  1. Understanding of what is involved in analysing language (technical).
  2. Improved code efficiency (technical).
  3. Substantially increased computer processing power (brute force).

This also means that modern speech recognition is better at recognising accent and voice differences. With training, software should adapt to work near perfectly for most users; I appreciate that is quite a bold claim.

As someone who used to be able to maintain a decent enough typing speed of between 70 and 80 words per minute (WPM), having that ability taken away from me was devastating; I was unable to work or partake in most of my hobbies. Having struggled through the horrid early years of dictation I can appreciate why people are loath to give speech recognition a try; however, just about every problem has gone away these days.

In general many people are not up to date with the latest information when it comes to cutting-edge technology; after all there is so much to do and learn. This is in part because non-specialist media outlets are often years behind when reporting non-sensational things; there is so much to talk about, and typically they repeat the same core points. In this ever-accelerating technological era I suspect anyone who has not used modern speech recognition has heard opinions that are about software from 10+ years ago. My title was not an attempt at clickbait. When I discuss or read things about speech recognition there is an understandable fixation on accuracy, but with modern software claiming accuracy of 90%+ for most people with little to no training, and 95%+ with some training, I wonder why accuracy is still considered a barrier to entry. My system seems to be about 99% accurate, but I appreciate it has been used a lot over many years. My point is that most people make errors when typing anyway; even with grammar and spell checkers, mistakes slip through. Even for those who manage a rare 100% accuracy the first time they type something, the result should still be double-checked. Mistakes are still made and accuracy is a concern whether typing or speaking, so why not do the vast majority of the work via speech?

When I was working in adult social services I had a severe RSI flare-up, in fact my worst ever, which caused a domino effect of problems. When I returned to work, for a while I was able to cope thanks to speech recognition, despite being in a large busy office. I was surprised at how accurate it was even with all the background conversations. Additionally, instead of using a mouse to navigate the screen, I found that voice commands had finally become efficient. How things had changed!

During long bouts of sleep deprivation I can somewhat rest my eyes whilst dictating. Thankfully I rarely get headaches, but dictation has also proved helpful when I have; I find it’s better to do something than nothing, since I’ll be suffering either way.

I’d like to highlight that a hybrid approach can be used. If you can still type and you want to, then do so. Dictating can also be quite easy with today’s smartphones, so maybe you can use speech recognition whilst away from your normal work area. For the following reasons I’d recommend at least experimenting.

Speech Recognition Pros & Cons

Pro 1: Health

When dictating we don’t need to be sat down or stood still; we are not tied to a keyboard. Since we can move about I often do so. Over the years I have done all manner of things whilst dictating, from physiotherapy and light exercise/stretching to things like cleaning or ironing. When I am having a particularly painful wrist episode my arms, shoulders and back all become problematic, resulting in difficulties sitting or standing for any length of time, so on particularly bad days I’ve even dictated whilst resting in bed.

Con 1: Training Time Investment

Like any new skill there can be a learning curve, which can vary dramatically from person to person. These days, though, even without any training, dictation on a modern device with modern software can start out at 90%+ accuracy.

I appreciate that getting out of comfort zones and allocating time to learn something can be challenging. Saying embrace the challenge is all well and good, but people and their situations can vary wildly. It is sensible to decide during an epically busy time that doing something new is too much of a risk, but because life is strange maybe the change will quickly be beneficial, even in regard to time, which links to Pro 2 …

Pro 2: Speed

Personally, I think the health reason is reason enough, but just in case, here is another. Just because a person is good at typing does not mean they should stick with that method, since dictating can allow them to be faster. I often find it easy to dictate over 100 WPM, sometimes as high as 150 WPM; granted, a few typists with specialist keyboards can beat that, but for the vast majority of people dictation is twice as fast as typing.

Following on from Con 1, it is worth learning the extra functions like how to navigate via dictation, as well as the various advanced commands. Going from quick dictation to struggling to carry out navigation commands can make you feel like a writing session was ruined; writers typically have enough reasons to procrastinate without imagining new ones 😉

Speed is a major factor for writing events like #NaNoWriMo, thus the speed advantage of dictation can really pay off.

Con 2: Initial Costs

Not everyone has a computer (desktop/laptop/tablet) or smartphone (I’m only differentiating because so many people typically do, as it is really just a computer with a phone function). Free speech recognition exists, but I find Dragon NaturallySpeaking to be better overall, although it isn’t cheap.

Then there is the topic of what microphone to use. Whilst you can use a laptop’s built-in microphone it is better to have a decent microphone, although I’ve found that a £25 microphone works just as well as my more expensive Yeti, so you don’t have to buy crazy equipment.

Other extras: I’ve also invested in a microphone stand, pop-filter, USB cable extension and a high-quality wireless headset. The extension and the wireless headset are the reason I can exercise or tidy my room whilst dictating.

One of the problems I found using my fantastic quality Yeti microphone was there were a few delays/problems with the software, but this was because I had leaned back in my chair and thus wasn’t close enough to the microphone. So before you rush off to buy an expensive microphone consider how your setup can be altered to get improvements.

Pro 3: Speaking is Natural + Rhythm of Speaking

Based off this subtitle you can see why Nuance called their software NaturallySpeaking 😉 Particularly when dictating dialogue I find I can write a better scene; I think this is down to being able to somewhat act the scene out, as I feel more in character when I switch back and forth between character perspectives. I’ve even experimented with literally acting a scene out, although that led to some comedy moments of frantically changing my position to be the correct character, like a stand-up performance.

Sometimes we can spend a lot of time thinking about a subject only to find that when we speak we change what we had intended to say. There is something about speaking out loud; maybe it is because we engage more of our body, thus more of our brain. I also think this is probably a knock-on effect of evolution: as such a social species, we need to be careful of what we say to others.

One of the best tips for writers is: “Read your writing out loud.” Dictating can be a big help here; you get used to speaking out loud, thus when it comes time to edit your work you are more likely to give it a try. This also links to one of the key tips from The Bestseller Experiment: “Make a public declaration.”

There is another advantage to dictating. If you think of a sentence and then struggle to dictate it, that is a sign there is a problem. Typically you’ll easily find a rhythm, indicating where commas and full stops best fit; granted you have to say “comma”, but I think that is no different to having to press the comma key. Maybe somebody who struggles with grammar could benefit from dictation?

Con 3: Editing

As I mentioned above, I think this is a con that gets too much attention, since work should be double-checked anyway. Still, it can be particularly irksome during the training period, when correcting (editing) as you go is highly recommended. I think a valid point about the accuracy aspect is that dictation errors are typically ones we are aware of, unlike when most people type and things slip through.

Crucially this is a problem that fades over time; I rarely need to correct things. Since I write fantasy fiction and role-playing games I also have lots of additions for my fantasy proper nouns, and my system mostly recognises these new words after the initial correction or two. Just like with typing it is more important to get something written first; then you have something to edit.

Pro 4: Flow

Due to the pain from my disability, I lost my ability to enter a flow state whilst writing/typing. It was 2009 when this feeling briefly appeared during dictation. My comfort level with dictating slowly grew over the years; by 2009 I found talking to my computer to be not only comfortable but also empowering.

Con 4: Habits

Initially when first learning to use speech recognition a user can feel they are wasting their time. Why bother stressing yourself out, fighting your habits? I’ve separated this point from Con 1: Training, because I think habits/traditions are such a powerful part of our psychology.

Habits are typically difficult to break; various people can react differently to the same thing. Decades ago I had the regular association of being denied the use of my wrists to type a decent work session, the threat of pain from typing as well as sitting too long, plus stress and sleep deprivation. Since back then speech recognition was lacking, I quickly developed justifications about putting things off. In the light of pain-paranoia and frustration it became easy to justify thoughts like “I need to minimise computer usage even using dictation, so I need to work out as much as possible upfront.” Once I developed this habit I found it hard to break it, even as the ability of speech recognition improved.

Pro 5: Focus

I find I do not get distracted as much when I am dictating. Maybe because I am typically away from my desk, so I cannot easily check emails or browse. It can seem like our hands have a mind of their own when within a split second of thinking about a website we’ve switched to that. This is why so many writers use blocking software that restricts their access to the Internet. Following on from Pro 3, I find that if I do start giving my computer commands to browse non-important things I quickly stop myself.

Con 5: Stream of Consciousness

Dictating does not dictate quality. The fact we can dictate more WPM means we can also have more to edit. This is a minor Con, yes I’m being nit-picky, but over the years I have dictated a lot of garbage. I think I have solved this by writing more, showing others my work, learning more about writing; not just practice, but learning to carry out skilled practice. If you feel that when you start dictating you are writing garbage, don’t worry I think you’ll quickly adapt.

Bonus Pro: Moving is Thinking

Linking back to Pro 3: Speaking is Natural, there is something about moving and thinking, and dictation means you don’t have to be sat still at a keyboard. When we move we are activating different brain regions, plus getting the blood flowing, etc. Physical intelligence is one of the many types of intelligence being researched, and whilst kinaesthetic learners are typically separated from other learning types, the majority of people can learn in all manner of ways, including kinaesthetically. Quick interesting point: animals have a more developed brain than plants because they need to navigate; the sea squirt is a fascinating creature that, once it finds a permanent spot for its next stage of life, eats its own brain. It is also worth looking into the tools of memory specialists and how they utilise virtual spaces to associate memories for better recall.

I’ll be making a video version of the blog in the new year, but before I finish here are some extra points. Dictating role-playing mechanics is not a big deal; I even used speech recognition to dictate computer code years ago, and I am contemplating giving it another go with the vastly improved software of today. When out, or when in bed trying to sleep (chronic pain is hell), I’ve dictated notes via my smartphone’s built-in software; granted it is not as powerful as Dragon, but it is easy to do and I don’t have to get out of bed. I’ve also made use of a Dictaphone with a headset whilst walking, recording notes that I later dictated at home; this counted as a first draft. Dragon Anywhere allows for dictating on the go, but I cannot afford it, I am rarely out, and I already have Dragon 15.

In conclusion, if you are still not sure whether speech recognition is for you, I highly recommend giving it a go; at least go hybrid and mix things up. The future is already happening!

Links

I’ve written about The Bestseller Experiment before.

The Bestseller Experiment Podcast

Julian Barr

NaNoWriMo

RPG Welcome to the Technocratic Union

I have uploaded an in-character video for my current Mage 20th chronicle. I originally started work on this a while ago, but due to health problems, limited time and not knowing much about making videos I placed the project on hold. My health has improved a bit, plus I’ve had time to think and do a bit more research, so I’ve remade the video; I accept that my video making and acting skills will take time to improve.

This introductory video is part of a series for my current World of Darkness chronicle, focusing on Mage 20th. There are a whole bunch of things I think I need to work on, but at least it’s a start.

 

Videos for #RPGaDay2018

Following on from my last post about #RPGaDay2018, I have finally started making videos. I decided to embrace my lack of video making knowledge and mediocre acting skills, along with a typical dose of nerves; there is no pressure if it is supposed to be poor and silly, right? I intentionally kept the rat voice as my own, honest.

 

#RPGaDay2018 is Coming

#RPGaDay is an annual event that runs in August, with 31 questions about role-playing to promote positive conversation about all RP & RPG; it’s now in its 5th year. I thoroughly enjoyed answering last year’s #RPGaDay (my #RPGaDay2017) and I have been looking forward to this year’s questions. Once again David F Chapman (@autocratik) and Anthony Boyd (@Runeslinger) have organised the questions, with Will Brooks (@willbrooks1989) providing the lovely graphic design.

As there have been so many RP related questions over the years, David put out a request for questions for this year. I was keen to join in, so I reread the questions from previous years and then spent a week pondering ideas. I eventually had a sizeable list covering quirky concepts, art & music, and surprising successes from failures. I appreciated that drawing positives from negatives could psychologically prime a participant to think of too many negative things and possibly undermine the aim of the event, but I thought the reframing of a bad situation would be appreciated by most. I was pleased to find that a few of my questions made it onto the list; I’m aware others may have asked similar questions 😉

I am really impressed with this year’s questions; David and Anthony have done a great job. Even more impressive given how many previous questions have been asked. There are also some alternate questions at David’s blog http://autocratik.blogspot.com/2018/07/

#RPGaDay #RPGaDay2018

Inspirational Friends

Back in January 2018 my friend Richie Janukowicz started a vlog with the aim of doing a video a day. Richie has been involved in a few podcasts, Geek Pride and Noobgrind websites, as well as various creative projects, so he knew what he was letting himself in for.

I decided I would comment on every video, a simple way of keeping in touch. On some days it has been hard to figure out what to write, which is a bit pathetic but also a good example of self-imposed pressure; technically I don’t need to entertain, just comment.

I’ve not written a blog for months; besides my usual health issues, I have preferred to focus on writing and design. Thankfully I’ve gotten a lot of writing done, including finishing a story. I have also been building towards making some videos about role-playing, which Richie and I have talked about many times. As for most people, it has been an issue of allocating time and thus deciding what to give up. I found Richie’s daily vlogs inspiring: somebody with his own health problems, as well as family and work commitments, has managed it. The shorter form of Richie’s videos as his workload increased is also a reminder that you can still hit a target by adjusting things. Whilst long form video essays are something to aspire to, they are extremely time-consuming, and that quality level is something to build towards, not demand of yourself at the start.

On an academic note, although there are many content creators on YouTube with an impressive number of followers and a lucrative number of views, apparently the vast majority of YouTube’s content is family-centric content viewed by just a few relatives. So Richie’s videos also provide a way for his friends and family to keep somewhat up to date with him, as well as being a diary. I presume Dr Michael Wesch’s “An anthropological introduction to YouTube” is still applicable.

GollanczFest 2017 part 6

This continues on from my first post about the Gollancz Festival 2017.

After the morning panels were finished I got another chance to talk with Mark Stay (Orion Publishing, Author & The Bestseller Experiment). It was an informal chat with Mr Stay, as he was on hand for any customer or author enquiries. We had a chance to discuss The Bestseller Experiment, and briefly touch on some of the other projects he had mentioned in some of the bonus video chats the two Marks had done, and of course what was at the time the big query regarding the future of The Bestseller Experiment. I managed to avoid pitching my current projects, and when Mark asked about my work I gave a concise overview; I think I did well considering how much I’d have liked to have said 😉

Mark Stay guardian of the Author-Portal for #GollanczFest 2017

Mr Stay’s welcoming professionalism was even more impressive in person. I had planned on writing about a few things that Mark had highlighted in our chat, like things to keep in mind when discussing a subject that readers and writers alike are so emotionally invested in. Helpfully Mark recently wrote about this subject on his blog: 25 things I’ve learned from 25 years in books… He has also touched on many of these points on The Bestseller Experiment.

Once I knew Richie had finished the morning Writers’ Workshop sessions at Phoenix we met up for lunch and spent the majority of it frothing about writing. As a bonus I got to have some of my favourite food: sweet buns.

I was quite curious about the Writers’ Workshops, since I had tried to get tickets but they had sold out. Richie (Richie Digital) has written a lot over the years and has had a variety of interesting jobs, including a background in community filmmaking. He explained that many of the people at the workshop talked about being in the early stages of writing, and they got good advice from the various authors of note. He also received some great answers, plus, since he has actually finished a book, he received the bonus advice of: “What are you doing here? Just get it published.” As with so many productive people it comes down to managing competing priorities, and of course the typical writer’s overly critical view of their own work. Richie said he left the workshops with new inspiration; hopefully 2018 will see his work get picked up.

Gingerbread and NaNoWriMo 2017 p2

Continuing on from my previous post about the Gingerbread competition.

I successfully sent my entry for Gingerbread’s ‘One in Four’ competition, which involves Trapeze Books (part of Orion) and The Pool UK, before the deadline. I received a confirmation email a few minutes later so thankfully my paranoia was somewhat alleviated, not entirely of course. Whilst the book is not complete, I’ll take finishing the competition portion as completing something 😉

In 2016 I entered the Richard and Judy book competition. Whilst I managed to hit the deadline, I also disliked writing it. The story was about a family recovering from losing a child in a school shooting, and I simply didn’t enjoy writing it; no surprise given the subject matter. Part of the reason was that I was still massively struggling with my health then. This Gingerbread story is different: after struggling to get going and keep momentum I started to enjoy things. Also, considering how much effort I’ve put in, it would be silly to put it on hold. So the plan is to split my writing between my fantasy social services setting, non-fiction work on my role-playing guide, and continuing this Gingerbread ‘One in Four’ tale.

A shout-out to my editor-extraordinaire Damian, who also said the story so far was good. His feedback gave me a lot of confidence as I was going through my final tweaks and proofreading. Damian has helped me many times over the years, from helping me edit my rulebooks when I worked at KJC Games to several fiction writing projects; his eye for detail is impressively sharp.

I plan to resume posts about my Gollancz Festival 2017 experience next.