Comparing Captioned Phones: ClearCaptions Vs. CapTel

The interwebs are severely lacking any objective comparisons of the two major captioned (landline) telephones on the market: The ClearCaptions Ensemble and the CapTel 840i.

I’ve been using the CapTel for a few years, but the ClearCaptions Ensemble has a 90-day trial period. So I figured I had nothing to lose by trying it out.

Before I get to the captioning quality, which is admittedly the most important aspect, here are a few notes on other aspects of the phones:

Appearance/User Interface
CapTel 840i

The CapTel phone is decidedly unsexy. It is pretty old, large and clumsy. I’ve been using the 840i, which does not have a touchscreen, for a few years. However, when I went to the CapTel site, I saw that they now have a touchscreen version, the 2400i. I may have to try that in the future.

The ClearCaptions Ensemble looks much nicer. It is also all touchscreen, except for the power button. However, the touchscreen interface is horrendous. As I said to a friend, “It has a touch interface, but you wish that it didn’t.” When dialing a number, there is no delete key if you make a mistake. In addition, the dialpad changes to an awkward double row of numbers once you’ve entered the first few digits of the number you’re trying to call. In short, the Ensemble’s touchscreen offers zero usability advantage, and it is usually less usable than the clunky CapTel 840i.

ClearCaptions Ensemble

Captioning Quality

Obviously this is the most critical aspect of a captioned phone. Below I’ve posted a video with a side-by-side comparison. I used a YouTube video of a person speaking, to ensure that the audio was identical for each trial. Go ahead and check out the video first. I apologize in advance for some of the shaky camera work. My hands were starting to get very tired (see below for an explanation).

As you can see, the speed and accuracy of the CapTel phone are superior to those of the ClearCaptions phone. Not seen here are the dozen or so trials I did with the ClearCaptions phone, using a different, lower-quality video that better portrayed a one-sided phone call. Most of the time, the ClearCaptions phone did not caption anything, and I had to start the call again. The CapTel phone never had any issues with the other video. (This is why my hands, and I, were getting so tired and shaky.)

Additionally, one of the aspects of the ClearCaptions phone that I was excited about is that it supposedly integrates human captioning with automatic/computer-generated captioning, which is meant to make it faster. As a computational linguist/NLPer, this sounded great! However, as can be seen above, there is no speed or accuracy advantage. When making real calls with the ClearCaptions phone, there are many times when the automatic captions are completely incomprehensible.

Conclusion

While I love sleekness and gadgetry in my smartphone, the most important aspect of a captioned landline phone is reliability: It just has to work. The CapTel phone works faster and more consistently. That’s really all I need to know.

  • Hannibal Smith

    What about the CaptionCall? Let me see if I have this right. ClearCaptions and CaptionCall use voice recognition of the relay operator who is listening to the called party; Sprint CapTel and InnoCaptions use live stenographers?

    • angoodkind

      I wasn’t even aware of CaptionCall. Are they new?

      • Hannibal Smith

        No, they seem to be older than ClearCaptions. I’m guessing their first-generation touchscreen phone was not very good, so they haven’t been a major player until relatively recently.

        My beef with either kind is I don’t want a human [re]transcribing the called party; I want voice recognition applied directly to the called party, with human monitoring to correct mistakes if need be. So it’s a rather disappointing state of affairs. I hope it isn’t a legal requirement to always have a live human involved in transcribing.

        • Samantha Robles

          Hannibal, may I ask why you would rather have a machine transcribe instead of a human? I hope it IS required to have people do it and not just machines.

          • angoodkind

            Machines are much, much faster. At the moment they’re still slightly less accurate than humans, but as automatic speech recognition continues to get better, that gap will shrink and disappear. Also, with a human, there is someone else listening in on the call, which freaks some people out.

          • Samantha Robles

            Do you think a machine will be faster when a person has an extremely thick accent, is using very improper English, is stuttering, etc.? A human being may make some people feel a bit odd, but when it comes down to it, it’s more efficient. If a person is mispronouncing a word, a machine is going to print whatever it thinks is being said, whereas a person can tell what the speaker is trying to say and recognize that they may just be mispronouncing a word. Keep in mind also that even though a person is listening, that person has been trained very extensively, there is NOTHING that can identify the caller (no phone numbers, names, etc. are shared with the captioning agent), and nothing is recorded. The FCC regulates everything.

          • angoodkind

            Samantha – I work in artificial intelligence and automated speech recognition. Of course, at the moment, and maybe for the next 5 years, a person is more accurate in those cases. However, improvements are happening very rapidly with the technology, and computers have already been shown to be more accurate than humans on some datasets. It is only a matter of time until computers can handle all types of situations better.