
Comparing Captioned Phones: ClearCaptions Vs. CapTel

The interwebs are severely lacking in objective comparisons of the two major captioned (landline) telephones on the market: the ClearCaptions Ensemble and the CapTel 840i.

I’ve been using the CapTel for a few years, but the ClearCaptions Ensemble has a 90-day trial period. So I figured I had nothing to lose by trying it out.

Before I get to captioning quality, which is admittedly the most important factor, here are a few notes on other aspects of the phones:

Appearance/User Interface
CapTel 840i

The CapTel phone is decidedly unsexy. It is pretty old, large, and clumsy. I’ve been using the 840i, which does not have a touchscreen, for a few years. However, when I recently visited the CapTel site, I saw that they now have a touchscreen version, the 2400i. I may have to try that in the future.

The ClearCaptions Ensemble looks much nicer. It is also all touchscreen, except for the power button. However, the touchscreen interface is horrendous. As I said to a friend, “It has a touch interface, but you wish that it didn’t.” When dialing a number, there is no delete button if you make a mistake. In addition, once you’ve entered the first few digits of the number you’re trying to call, the dialpad changes to an awkward double row of numbers. In short, the Ensemble’s touchscreen confers zero usability advantage, and the phone is usually less usable than the clunky CapTel 840i.

ClearCaptions Ensemble

Captioning Quality

Obviously this is the most critical aspect of a captioned phone. Below I’ve posted a video with a side-by-side comparison. I used a YouTube video of a person speaking, to ensure that the audio was identical for each trial. Go ahead and check out the video first. I apologize in advance for some of the shaky camera work. My hands were starting to get very tired (see below for an explanation).

As you can see, the speed and accuracy of the CapTel phone are superior to those of the ClearCaptions phone. Not seen here are the dozen or so trials I did with the ClearCaptions phone, using a different, lower-quality video that better simulated a one-sided phone call. Most of the time, the ClearCaptions phone did not caption anything at all, and I had to start the call again. The CapTel phone never had any issues with that video. (This is why my hands, and I, were getting so tired and shaky.)

Additionally, one of the aspects of the ClearCaptions phone that I was excited about is that it supposedly integrates human captioning with automatic/computer-generated captioning, which is supposed to make it faster. As a computational linguist/NLPer, I thought this sounded great! However, as can be seen above, there is no speed or accuracy advantage. When making real calls with the ClearCaptions phone, the automatic captions are often completely incomprehensible.


While I love sleekness and gadgetry in my smartphone, the most important aspect of a captioned landline phone is reliability: It just has to work. The CapTel phone works faster and more consistently. That’s really all I need to know.

Captioning Around the Country: CART vs C-Print

In the past 6 weeks, I have interviewed or attended Open Houses at 8 different schools around the country. Don’t get me wrong, I am flattered and humbled by the positive responses I received from my PhD applications.

But: It. Was. Exhausting.

Nonetheless, it provided an opportunity to try out different captioning systems and see what captioning is like in places that are not New York City.

First off, at every school I visited, I was able to secure captioning accommodations. It’s a good lesson that as long as you’re proactive and explain exactly what you need, most schools are able to comply. Thank you to all of the administrators and coordinators who helped set this up.

That being said, not all captioning is created equal. The experience made me realize that I’ve been pretty spoiled in New York City, with a relative abundance of well-qualified captionists at my disposal. The following bullet points largely serve as a comparison of CART captioning and C-Print, because after extensive googling I found zero qualitative comparisons.

  • The first observation is not a comparison. Rather, it is a direct experience with the phenomenon of “discount captioners,” as described by Mirabai Knight, one of the most proficient and most strongly activist captionists I’ve used. So-called “CART firms” troll court reporting schools for mid-level students and use them to offer cash-strapped schools extremely low rates. The result is a terrible experience for students, and a blemish on the reputation of CART captioning.
    • At one school, I actually pulled a professor aside as we were changing rooms and said, “I’m going to have to rely 100% on reading your lips, because I have literally no idea what the captioner is writing.” As Mirabai’s article explains, this is unfortunately all too common, as many schools do not realize that only highly proficient, highly trained captioners can provide a sufficient experience for deaf and hard-of-hearing students.
  • CART vs C-Print
    • Mirabai provides a bunch of great reasons why C-Print can fall short of CART captioning. I only used C-Print twice, whereas I’ve been using CART multiple times a week for the better part of 3 years. I’d strongly encourage anyone interested to check out Mirabai’s article.
      • Overall, C-Print was…fine. But when it comes to hearing, “fine” ≠ “adequate.”
      • C-Print does not advertise itself as a literal, word-for-word transcription. Rather, it only “ensures” that the most important things are transcribed. But “importance” is completely at the discretion of the captioner. There were a few occasions where I know the C-Print captioner did not transcribe words that I would consider important, such as the name of an institution where a researcher was located.
      • A C-Print captionist uses a QWERTY keyboard and relies on software that expands typed abbreviations into full words (see the toy sketch after this list). This usually works well enough, but C-Print is definitely at least 1-2 seconds slower than CART. While 1-2 seconds may not sound like a long time, I defy you to try having a conversation where everything lags 1-2 seconds behind. You’ll quickly see just how significant 1-2 seconds can be.
      • C-Print can be advantageous in noisy situations where an in-person CART captioner is not available, since remote captioning depends on a clean audio feed. I used C-Print at a lunch, in an environment where remote captioning definitely would not have worked. In that setting, a slower, more summarized transcription is better than a word-for-word transcription that cannot filter out a high level of background noise.
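
Since I just described how C-Print typing works, here is a toy sketch of the abbreviation-expansion idea. To be clear: the code and the abbreviations are made-up illustrations of the general mechanism, not C-Print’s actual software or dictionary.

```python
# Toy illustration of abbreviation expansion, the mechanism a C-Print-style
# system relies on. These abbreviations are hypothetical examples,
# NOT C-Print's actual dictionary.
ABBREVIATIONS = {
    "ab": "about",
    "gvt": "government",
    "bc": "because",
}

def expand(typed: str) -> str:
    """Expand known abbreviations; pass unrecognized words through unchanged."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in typed.split())

print(expand("ab the gvt"))  # -> "about the government"
```

Even with a large dictionary, the captionist is still typing and summarizing on a QWERTY keyboard, which is where that 1-2 second lag comes from.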

tl;dr: C-Print captioning is an okay substitute when in-person CART captioning is not available. But in no way should an institution feel that providing C-Print is equivalent to providing the word-for-word transcription that CART captioning provides.

NAACL ’15 Roundup

I just returned from NAACL 2015 in beautiful Denver, CO. This was my first “big” conference, so I didn’t know quite what to expect. Needless to say, I was blown away (for better or for worse).

First, a side note: I’d like to thank the NAACL and specifically conference chair Rada Mihalcea for providing captions during the entirety of the conference. Although there were some technical hiccups, we all got through them. Moreover, Hal Daume and the rest of the NAACL board were extremely receptive to expanding accessibility going forward. I look forward to working with all of them.

Since this was my first “big” conference, this is also my first “big” conference writeup. Let’s see how it goes.

Packed ballroom for keynote

Keynote #1: Lillian Lee, Big Data Pragmatics etc…

  • This was a really fun and insightful talk to open the conference. There were a few themes within Lillian’s talk, but my two favorites were why movie quotes become popular and why we use hedging. Regarding the first topic, my favorite quote was: “When Arnold says, ‘I’ll be back’, everyone talked about it. When I say ‘I’ll be back’, you guys are like ‘Well, don’t rush!’”
  • The other theme I really enjoyed was “hedging” and why we do it. I find this topic fascinating, since hedging is all around us. For instance, in saying “I’d claim it’s 200 yards away,” we add no new information with “I’d claim.” So why do we say it? I think this is also a hallmark of hipster-speak, e.g. “This is maybe the best bacon I’ve ever had.”

Ehsan Mohammady Ardehaly & Aron Culotta, Inferring latent attributes of Twitter users with label regularization

  • This paper uses a lightly-supervised method to infer attributes like age and political orientation, thereby avoiding the need for costly annotation. One way the authors infer attributes is by determining which Twitter accounts are central to a certain class. Really interesting, and I need to read the paper in depth to fully understand it.
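
If I understand the core idea correctly, label regularization penalizes the model when its average prediction over unlabeled users drifts away from known class proportions (say, demographic statistics). Here is a minimal sketch of that penalty term; the function name, shapes, and KL formulation are my own assumptions, not the authors’ code.

```python
import numpy as np

def label_regularization_penalty(pred_probs: np.ndarray, priors: np.ndarray) -> float:
    """KL divergence between known class proportions ("priors") and the
    model's average predicted label distribution over unlabeled examples.

    pred_probs: (n_unlabeled, n_classes) predicted class probabilities
    priors:     (n_classes,) known class proportions, e.g. census statistics
    """
    expected = pred_probs.mean(axis=0)  # model's average predicted distribution
    return float(np.sum(priors * np.log(priors / expected)))
```

Adding a term like this to the usual supervised loss lets unlabeled data pull the model toward realistic label proportions, which is how the costly annotation gets avoided.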

One Minute Madness

  • This was fun. Everyone who presented a poster had one minute to preview/explain their poster. Some “presentations” were funny and some really pushed the 60-second mark. Joel Tetreault did a nice job enforcing the time limit. Here’s a picture of the “lineup” of speakers.

Long line of speakers

Nathan Schneider & Noah Smith, A Corpus and Model Integrating Multiword Expressions and Supersenses

  • Nathan Schneider has been doing some really interesting semantic work, whether on FrameNet or MWEs. Here, the CMU folks did a ton of manual annotation of the “supersenses” of words and MWEs. Not only do they achieve some really impressive results on MWE tagging, but they have also provided a really valuable resource to the MWE community in the form of their STREUSLE 2.0 corpus of annotated MWEs/supersenses.

Keynote #2: Fei-Fei Li, A Quest for Visual Intelligence in Computers

  • This was a fascinating talk. The idea here is to combine image recognition with semantics/NLP. For a computer to really “identify” something, it has to understand its meaning; pixel values are not “meaning.” I wish I had taken better notes, but Fei-Fei’s lab was able to achieve some incredibly impressive results. Of course, even the best image recognition makes some (adorable) mistakes.
Image: a baby holding a toothbrush, with the middle caption mislabeling it as “a young boy is holding a baseball bat”

Manaal Faruqui et al., Retrofitting Word Vectors to Semantic Lexicons

  • This was one of the papers that won a Best Paper Award, and for good reason. It addresses a fundamental conflict in computational linguistics, specifically within computational semantics: distributional meaning representation vs. lexical semantics. The authors combine distributional vector representations with information from lexicons such as WordNet and FrameNet, and achieve significantly higher accuracy on semantic evaluation tasks across multiple languages. Moreover, their method is highly modular, and they have made their tools available online. This is something I look forward to tinkering with.
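
The method is simple enough to sketch here. This is my own toy reimplementation of the retrofitting update as I understand it from the paper (their released tools are the real reference): each word vector is repeatedly pulled toward the average of its lexicon neighbors while staying anchored to its original distributional value.

```python
import numpy as np

def retrofit(vectors: dict, lexicon: dict, iterations: int = 10, alpha: float = 1.0) -> dict:
    """Toy retrofitting sketch: nudge each embedding toward its lexicon
    neighbors (e.g., WordNet synonyms) over a few iterations.

    vectors: word -> np.ndarray, the original distributional embeddings
    lexicon: word -> list of neighboring words from a semantic lexicon
    """
    retrofitted = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iterations):
        for word, neighbors in lexicon.items():
            neighbors = [n for n in neighbors if n in retrofitted]
            if word not in retrofitted or not neighbors:
                continue
            # Closed-form update: blend the average neighbor vector with the
            # word's original vector; alpha controls the pull of the original.
            neighbor_avg = sum(retrofitted[n] for n in neighbors) / len(neighbors)
            retrofitted[word] = (neighbor_avg + alpha * vectors[word]) / (1.0 + alpha)
    return retrofitted
```

With alpha = 1, each update simply averages a word’s original vector with the mean of its neighbors, which is why the retrofitted vectors stay close to the distributional ones while absorbing the lexicon’s structure.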

Some posters that I really enjoyed

Overall impressions

  • Deep learning and neural nets are still breaking new ground in NLP. If you’re in the NLP domain, it would behoove you to gain a solid understanding of them, because they can achieve some incredibly impressive results.
  • Word embeddings: The running joke throughout the conference was that if you wanted your paper to be accepted, it had to include “word embeddings” in the title. Embeddings were everywhere (I think I saw somewhere that ~30% of the posters included them in their title). Even Chris Manning felt the need to comment on this in his talk and on Twitter.

Takeaways for Future Conferences

  • I should’ve read more of the papers beforehand. Then I would have been better prepared to ask good questions and get more out of the presentations.
  • As Andrew warned me beforehand, “You will burn out.” And he was right. There’s no way to fully absorb every paper at every talk you attend. At some point, it becomes beneficial to just take a breather and do nothing. I did this Wednesday morning, and I’m really glad I did.
  • Get to breakfast early. If you come downstairs 10 minutes before the first session, you’ll be scraping the (literal) bottom of the barrel on the buffet line.

Shameless self-citation: Here is the paper Andrew and I wrote for the conference.