IMLD 2019: “Once words are written, they can’t be taken back”

For International Mother Language Day 2019 (21 February), we interviewed Tisane Labs CEO Vadim Berman.

Q: Vadim, you are a co-founder and the CEO of Tisane Labs, which provides text analytics for over 20 languages. How did you get involved with languages?

Vadim Berman, CEO, Tisane Labs

VB: One could describe me as a “serial immigrant” or “ethnically confused” 😊. I have lived in 6 countries; my first move was back in 1991, from the then-Soviet Union to Israel, when I was in my teens. I had to master Hebrew at school; I was also the family translator. I was upset that my English, which I thought was pretty good, was nowhere near the local level. I sat down with a textbook and then forced myself to keep a diary in English.

Later, when I started my career as a software engineer, I became obsessed with the ties between languages and software. The first push to explore the world of natural language processing was an article in Wired back in 2000 called Talking to Strangers. (Looks like it’s still there: https://www.wired.com/2000/05/translation-2/ – little did I know that I would meet some of the people mentioned there in person.) The second push was a speculative fiction story by Jorge Luis Borges called Tlön, Uqbar, Orbis Tertius, which describes an imaginary world whose inhabitants deny reality and, as a result, use languages without nouns. I wondered how software could handle this kind of language.

I had zillions of ideas and wanted to try them all. I thought the combination of my linguistic and software development skills gave me an advantage. I started experimenting with machine translation and extraction of meaning. The interest became an obsession, the obsession then became a living, and within a few years, by then living in Australia, I co-founded LinguaSys, an American venture which was later acquired by Aspect Software, a US-based multinational. I met much of my current team at LinguaSys. I went on to start Tisane Labs in 2017, after I left Aspect.

 

Q: Tisane stands apart by supporting so many different languages. Can you explain why?

VB: In one colloquial sentence, because we can and because we have to.

The decision to focus on multilingualism was motivated both by the economics and by the possible applications. There is a shared, linguistically neutral core, so a new language does not start from zero. There are multiple mechanisms for easily reusing shared elements between language models.

When the less mainstream languages take less effort to add, they become more economically viable. As these markets are often underserved, we can benefit from less competition and more coverage.

While there are many use cases where one or two languages are enough, in many scenarios, such as the hospitality industry, ignoring most languages means ignoring a significant segment of your customer base, no matter how small your business is.

 

Q: Can you give us some interesting examples of something similar / different among these languages?

VB: Funnily enough, the biggest similarity is that in every language, many native speakers believe that their language is the most unique, the most complex, and deserves the most special treatment.

My focus is on the inner workings of languages. I see them not as an amorphous bag of words, but as a set of cogs, levers, and pulleys.

On the structural level, after a while it looks like different languages borrow from more or less the same bag of tricks. There are lots of differences, of course, but when we add support for a new language, we find that a new phenomenon can be handled the same way as a seemingly different phenomenon in a different language. For example, when you think about it, English compound verbs (like “give away” or “tell off”) behave somewhat similarly to the German and Dutch separable verbs (e.g. “ankommen” in German), and so the ways we handle both are very similar.

All in all, languages are influenced by the national mindset. If the speakers see a need for a word or a structure, it will be invented. If they like nuance, one word will have fewer interpretations. At one point, my team was working with hotel and restaurant reviews, which are a pretty good representation of how an average person thinks. We had a blast looking at all the different aspects. There were lots of Swedish reviews that were, for some reason, obsessed with glass doors in the bathroom. Russians were keen to write long, creative travelogues. My personal favourite was a French review of a fine dining establishment, which ended with the following: “It was almost perfect. Almost! The tiny flaw we found was that the antenna of the lobster was broken. All the rest was wonderful, and I would’ve given 9.5 out of 10, but because there is no way to deduct a fraction of a point, I give 9 out of 10.”

 

Q: International Mother Language Day focuses on indigenous languages this year. What do you think AI could bring to indigenous languages?

VB: Most importantly, the promise not to be forgotten.

By some estimates, a language dies with its last speaker every two weeks, and anywhere from one in nine to half of all languages are predicted to disappear by the end of the century.

If we catalogue the language somewhere, somehow, it will exist. There is a Russian saying which can be roughly translated as “once words are written, they can’t be taken back” (in Russian, “что написано пером, не вырубишь топором”). Today, with better tools, we can preserve context and samples of a language, analyse its structure, derive conclusions about its origin, and learn more about the culture that created it. It may even be used to decipher or understand another language, as MIT researcher Regina Barzilay demonstrated with Ugaritic in 2010 (http://news.mit.edu/2010/ugaritic-barzilay-0630).

A well-documented dead language may also come back to life. There are multiple examples of dead languages that were revived: Cornish, Dalmatian, Hebrew, and more.

As AI software today is an essential part of the modern infrastructure, the lack of software support causes native speakers to “abandon ship” and increases the risk of a language becoming marginalised and eventually disappearing. Monolingual speakers of poorly supported languages are somewhat locked out of the global discourse.

Sadly, natural language processing today is mostly data-hungry, and therefore it takes a long time for advances to trickle down to languages with little data. We see it as our mission to change that.

Tisane Labs Launches Solution to Detect Hate Speech and Cyberbullying

published on Yahoo Finance via PRNewswire

Affordable API enables developers and businesses to detect hate speech, cyberbullying, unwanted sexual advances, criminal activity, and more

WASHINGTON, Nov. 13, 2018 /PRNewswire/ -- Tisane Labs, a supplier of text analytics AI solutions, today announced the launch of Tisane API, the first API to detect and classify abusive textual content in 27 languages. Tisane detects hate speech, personal attacks, unwanted sexual advances, and criminal activity in text, with additional varieties of detected abuse to come.

“Trolls, bigots, harassers, and criminals have made the Internet an unpleasant and at times dangerous place. For the users, it often means being unsafe online, with possible consequences in real life. For the online communities, it means high user turnover, additional moderation headaches, and enormous monetary losses or legal issues,” said Vadim Berman, Chief Executive Officer and Co-founder of Tisane Labs. “Now, with Tisane API, online communities can automate much of the moderation process and even warn potential offenders before the post is published. Rather than producing a blanket statement and a floating-point figure, Tisane API pinpoints the actual instance of abuse and classifies the type of abuse.”

Tisane API runs in the cloud, with a simple REST interface that can be called from any popular programming platform. Tisane Labs provides a range of plans for every pocket, with the option of a custom on-premises installation and a generous FREE plan.

To try Tisane API, visit https://tisane.ai.

For more information, contact Carla Johnston (email: Carla.Johnston@tisane.ai or call: +1 (703)-628-8827)

Carla Johnston to join Tisane Labs as CRO

We welcome Carla Johnston, who was also part of the former LinguaSys team, on board as our Chief Revenue Officer. Carla brings a wealth of knowledge and experience in natural language processing sales, along with enormous dedication. Carla is located in the Washington, D.C. area.

Tisane Labs launches Tisane API

Tisane Labs is pleased to announce the release of Tisane API.

Harness the power of next-generation AI to extract more from text in 27 languages: detect hate speech, sexual harassment, and cyberbullying; extract topics; and find out not only whether, but also why the customer is happy or unhappy with your product or service. Our applications and components are accessible in the cloud on a subscription basis (SaaS), can be installed on premises, or can be embedded in third-party applications for seamless integration and security.

We support: English, Chinese (Simplified and Traditional), Arabic, Danish, German, Spanish, Persian, Finnish, French, Hebrew, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Pashto, Portuguese, Russian, Swedish, Thai, Turkish, Urdu, Vietnamese.

We offer several ways to use our components, from a generous free plan (not a limited trial) to enterprise-grade plans and on-premises installation options. Whether you’re a small startup, an independent developer, or an enterprise, we can work together.

Questions? Browse our knowledge base, chat with us using the real-time chat widget, or email us.

Sign up and start using Tisane API today.