Showing posts with label NLP. Show all posts

Tuesday, November 19, 2013

A word or two or 10 about Word Clouds

Beyond Search - Easily Generate Your Own Word Clouds

Word clouds have become inescapable, and it is easy to see why: many people find such a blending of text and visual information easy to understand. But how, exactly, can you generate one of these content confections? Smashing Apps shares its collection of “10 Amazing Word Cloud Generators.”

...

VocabGrabber is different. It doesn’t even make a particularly pretty picture. As the name implies, VocabGrabber uses your text to build a list of vocabulary words, complete with examples of usage pulled directly from the content. This could be a useful tool for students, or anyone learning something new that comes with specialized terminology. If your learning materials are digital, a simple cut-and-paste can generate a handy list of terms and in-context examples. A valuable find in a list full of fun and useful tools.

Smashing Apps - 10 Amazing Word Cloud Generators

Smashing Apps has been featured at Wordpress Showcase. If you like Smashing Apps and would like to share your love with us, you can click here to rate us.

In this session, we are presenting 10 amazing word cloud generators for you. Word cloud can be defined as a graphical representation of word frequency, whereas word cloud generators simply are the tools to map data, such as words and tags in a visual and engaging way. These generators come with different features that include different fonts, shapes, layouts and editing capabilities.

Without any further ado, here we are presenting a fine collection of 10 amazing and useful word cloud generators for you. Leave us a comment and let us know what you think of the proliferation of design inspiration in general on the web. Your comments are always more than welcome. Let us have a look. Enjoy!

 


Make sure you click through, as SmashingApps has done a great job with a blurb and a snapshot for each one.
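Since a word cloud is, at heart, just word frequency drawn big and small, here is a rough Python sketch of the counting half. It is not how any of the listed generators work; the stop-word list, the `word_frequencies` and `font_sizes` names, and the 10pt to 48pt range are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real tools use much longer lists).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "or"}

def word_frequencies(text, top_n=10):
    """Count how often each non-stop word appears, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

def font_sizes(frequencies, min_size=10, max_size=48):
    """Scale each count linearly into a font size between min_size and max_size."""
    if not frequencies:
        return {}
    highest = max(count for _, count in frequencies)
    lowest = min(count for _, count in frequencies)
    spread = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return {
        word: round(min_size + (count - lowest) * (max_size - min_size) / spread)
        for word, count in frequencies
    }

sample = "Word clouds show word frequency: the more often a word appears, the bigger the word."
sizes = font_sizes(word_frequencies(sample))
for word, size in sizes.items():
    print(f"{word}: {size}pt")
```

The layout step (fitting the words into a shape without overlaps) is the hard part, and that is what the generators above do for you.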

 

Related Past Post XRef:
Wordle’ing Terms of Service Agreements – How a ToS would look as a word/tag cloud
Bipin shows us that creating a tag cloud doesn't have to be hard to do (in ASP.Net)
Interactive WinForm Tag Cloud Control (Think “Cool, I can add a Word/Tag Cloud thing to my WinForm app!”)
"WordCloud - A Squarified Treemap of Word Frequency" - Something like this would be cool in a Feed Reader...
Feed Stream Analysis - Web Feed/Post Analysis to Group Like/Related Posts
WordNet
"Statistical parsing of English sentences"
"A Model for Weblog Research"

Monday, November 18, 2013

10 Professionals, 10 views on the coming trends in text analytics

KDNuggets - Top 10 trends in text analytics

Data Driven Business recently interviewed forward-thinking text analytics professionals from leading companies like Bank of America, Home Depot and PayPal about the challenges they face, how they are overcoming them, and the industry as a whole.
Alesia Siuchykava, Data Driven Business/Text Analytics News, Nov 13, 2013.


Data Driven Business recently conducted interviews with text analytics professionals from a number of leading companies and identified 10 trends in text analytics that can be observed over the next 6-12 months.

1. Fusion of text (unstructured) data with structured data ...

2. Increase in interest in multilingual text analytics. ...

3. Algorithmic understanding of social media comments. ...

4. Commercialization of sentiment detection. ...

5. Finding trends and trending events in news streams. ...

6. More built-in visualization capabilities. ...

7. Streaming real-time text analytics. ...

8. Use of text analytics for getting insights from unstructured Big Data. ...

9. Advances in machine learning. ...

10. Integration of different capabilities. ...

Some interesting thoughts in, and on, these trends. Note the common themes? Big Data, real-time, social...

Monday, October 28, 2013

"Theory and Applications for Advanced Text Mining" Open eBook...

Intech - Computer and Information Science - Information and Knowledge Engineering - Theory and Applications for Advanced Text Mining

Edited by Shigeaki Sakurai, ISBN 978-953-51-0852-8, 218 pages, Publisher: InTech, Chapters published November 21, 2012 under CC BY 3.0 license DOI: 10.5772/3115

Due to the growth of computer technologies and web technologies, we can easily collect and store large amounts of text data. We can believe that the data include useful knowledge. Text mining techniques have been studied aggressively since the late 1990s in order to extract the knowledge from the data. Even though many important techniques have been developed, the text mining research field continues to expand to meet the needs arising from various application fields. This book is composed of 9 chapters introducing advanced text mining techniques. They range from relation extraction to techniques for under-resourced or less-resourced languages. I believe that this book will give new knowledge in the text mining field and help many readers open their new research fields.


Published just last year, this free eBook looks interesting (well, to me anyway). Most of it is way over my head, but there's enough here that it looks like a good set of reads... Also, the WordNet chapter, Ontology Learning Using Word Net Lexical Expansion and Text Mining, caught my eye.

(via KDNuggets - Free Book: Theory and Applications for Advanced Text Mining)

 

Related Past Post XRef:
WordNet
"Statistical parsing of English sentences"
Feed Stream Analysis - Web Feed/Post Analysis to Group Like/Related Posts
SharpEntropy - Maximum Entropy Modeling

Mix OpenNLP, IKVM.Net and C# and you get some noun phrase and contextual relevance goodness

Friday, October 04, 2013

Comparing Sentiment Analysis REST APIs

Skyttle Blog - A tool for evaluating Sentiment Analysis REST APIs

There is a growing number of Sentiment Analysis REST APIs out there, and the potential user is faced with a lot of choice. Accuracy of analysis is the most important factor, and the best way to see if an analyzer will perform well in the intended task is to run different analyzers on a sample of your data and compare their output with manually assigned sentiment labels.

To make it easier for potential users to run such experiments, we’ve released a small open-source project. The project implements clients to several Sentiment Analyzers: Alchemy, Bitext, Chatterbox, Datumbox, Repustate, Semantria, Skyttle, and Viralheat. As input, it takes a text file with short texts, each annotated as positive, negative or neutral, and outputs a spreadsheet where responses of each API are recorded, as well as an accuracy rate and an error rate calculated against the manual labels.

The project is available on github: https://github.com/skyttle/sentiment-evaluation. Once you clone/unpack it, you will need to install requirements:

...
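To get a feel for the comparison the Skyttle project describes (API output checked against manual labels), here is a minimal Python sketch. It is not the project's code: `score_analyzers`, the `api_a`/`api_b` names and the sample labels are hypothetical, and I'm assuming accuracy is simply the share of exact matches with error rate as its complement, which may not be how the project defines them.

```python
def score_analyzers(gold_labels, predictions_by_api):
    """Return {api_name: (accuracy, error_rate)} measured against the manual labels."""
    results = {}
    total = len(gold_labels)
    for api_name, predictions in predictions_by_api.items():
        if len(predictions) != total:
            raise ValueError(f"{api_name}: expected {total} predictions")
        # A prediction counts as correct only when it matches the manual label exactly.
        correct = sum(1 for gold, got in zip(gold_labels, predictions) if gold == got)
        accuracy = correct / total if total else 0.0
        results[api_name] = (accuracy, 1.0 - accuracy)
    return results

# Made-up sample: four short texts labelled by hand, plus two pretend API responses.
gold = ["positive", "negative", "neutral", "positive"]
predictions = {
    "api_a": ["positive", "negative", "positive", "positive"],
    "api_b": ["neutral", "negative", "neutral", "negative"],
}
scores = score_analyzers(gold, predictions)
for name, (accuracy, error_rate) in scores.items():
    print(f"{name}: accuracy={accuracy:.2f} error={error_rate:.2f}")
```

The useful part is running this on a sample of your own data, since an analyzer that does well on tweets may not do well on support tickets.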

SemantAPI - Semantapi.Robot

SemantAPI is a free, open source toolkit intended for a quick and easy comparison of the most popular NLP and sentiment analysis solutions on the market. The toolkit offers 2 independent analysis applications: SemantAPI.Robot and SemantAPI.Human. Both applications are written in C# and based on Microsoft’s .Net framework 3.5 platform.

Redistributable package of SemantAPI toolkit can be downloaded here.
The source code is available on GitHub here.

SemantAPI.Robot is an application that takes the specified source file and runs an analysis of every line therein, using the selected services.

The results are generated in a regular CSV file, with two columns per selected service:

  • The “sentiment score” column contains float sentiment values provided by the target service, which can be used for precise sentiment analysis.
  • The “sentiment polarity” column contains a verbal representation of the sentiment score, making it easy to read and understand at a glance.

The current version of the SemantAPI.Robot application supports the following NLP solutions:

  • Semantria. Modern, fast-growing NLP solution based on Lexalytics’ Salience engine.
  • AlchemyAPI. One of the world’s most popular NLP solutions.
  • Chatterbox. Social technology engine that uses machine learning for sentiment analysis.
  • Viralheat. Social media monitoring solution that offers a sentiment analysis API for 3rd-party integrators.
  • Bitext. Semantic technologies solution with a sentiment analysis API that claims to have the highest accuracy on the market.
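The "two columns per selected service" CSV layout described above can be sketched in a few lines of Python. This is only an illustration, not SemantAPI's code (which is C#): the `polarity` and `build_rows` helpers, the `service_a`/`service_b` names, the -1 to 1 score range and the 0.1 threshold are all assumptions, and each real service has its own scale and labels.

```python
import csv
import io

def polarity(score, threshold=0.1):
    """Turn a numeric score into a label (the threshold is an arbitrary assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

def build_rows(texts, scores_by_service):
    """Build a header plus one row per text, with a score and a polarity column per service."""
    services = sorted(scores_by_service)
    header = ["text"]
    for service in services:
        header += [f"{service} score", f"{service} polarity"]
    rows = [header]
    for index, text in enumerate(texts):
        row = [text]
        for service in services:
            score = scores_by_service[service][index]
            row += [score, polarity(score)]
        rows.append(row)
    return rows

# Made-up scores standing in for real service responses.
texts = ["Great phone", "Battery died in a day"]
scores = {"service_a": [0.8, -0.6], "service_b": [0.05, -0.3]}
rows = build_rows(texts, scores)

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Keeping the raw score next to the label means you can re-threshold later without calling the services again.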

This is a day job kind of thing, one that I'm seeing more chatter and discussion about. In house we've licensed one library, and I've built a couple of proof-of-concept apps with it. But I was doing so kind of in a vacuum, not being able to compare the results against another platform. Well, with this, I now can!

That, and I just like the idea of these services and having the C# to access them all... :)

Wednesday, August 29, 2012

Mix OpenNLP, IKVM.Net and C# and you get some noun phrase and contextual relevance goodness

randonom - Extracting noun phrases with contextual relevance in .NET using OpenNLP

A few months ago I was working on a project that had a word cloud-like feature. A word cloud is an interesting way to visually represent a popular theme or topic. I had a dataset of user reviews from another project that we wanted to parse and use. This began my first exposure to Natural Language Processing (NLP) and other advanced text analytics tools.

...

A viable .NET implementation

Eventually I came across a wiki article entitled “A quick guide to using OpenNLP from .NET” that introduced me to a remarkable project called IKVM.NET. After generating a shiny new .NET OpenNLP assembly with the steps provided, I was able to use the OpenNLP namespaces with ease in my project.

The first step in using the parsers in OpenNLP was to instantiate a model using Java streams. I created a base class for my NounPhraseParser with a utility method to help load these models.

...

Conclusion

I think this project worked out remarkably well. I don’t know if I’ll attempt to use something like this in a production environment, but if nothing else it was a very enlightening foray into the interesting world of Natural Language Processing. There are many other subjects in this area that I would like to explore, such as Sentiment Analysis and ways to identify subjects of significance in large bodies of text. As the IBM Watson project demonstrated to us not too long ago, this is a young field with staggering potential. The current trajectory of research along with significant advances in computation capability suggest it won’t be long before we can communicate with computers/information systems as easily as if you were talking to your best friend.

...


I can't believe it's been 6 years since I've blogged about OpenNLP (sigh, and I've still not worked on the project I had meant to when watching for it then... It's on the list still... but...). Anyway... If you've wanted to do natural language processing (NLP) and are looking for options, then check out Sean's post...

(via DotnetKicks - Extracting noun phrases with context in .NET using OpenNLP)

 

Related Past Post XRef:
SharpEntropy - Maximum Entropy Modeling
"Statistical parsing of English sentences"
WordNet

Java for .Net? Yep, the IKVM.NET way...
Java for .Net? Ja!
Java Implementation for Mono/.Net (IKVM.Net)

Thursday, February 09, 2012

NLP is Hard... But with AboditNLP it's not as...

Elegant Code - NuGet Project Uncovered: AboditNLP

"If you are coming to this series of posts for the first time you might check out my introductory post for a little context.

AboditNLP is a Natural Language Processor library. This kind of stuff is interesting, but not something I have chosen to spend my time on.

It has a demo http://nlp.abodit.com/home/demo which gives you sample things to type to the library. ..."

AboditNLP

Natural Language Conversations

A conversational natural language engine allows humans and computers to converse in a natural way using typed messages. These messages may be exchanged using SMS, XMPP chat, email, or web-based chat.

Conversational interfaces are an improvement over form-filling interfaces for many applications. One of the best known examples of this is the input box on Google calendar where you can type something like '10:20 meeting at Larry's' and it will create a meeting starting at 10:20 and will set the location to 'Larry's'. This particular example is a hybrid interface where a conversational element has been integrated into a traditional form-based interface. A pure conversation interface is often called a 'chatbot'.
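To make the '10:20 meeting at Larry's' idea concrete, here is a toy Python sketch of a quick-add input box. It is a single regular expression, which is certainly not how Google calendar or AboditNLP do it; the `QUICK_ADD` pattern, the `parse_quick_add` name and the returned fields are assumptions for illustration only.

```python
import re
from datetime import time

# Pattern: "<hour>:<minute> <title>" with an optional trailing "at <location>".
QUICK_ADD = re.compile(
    r"^\s*(?P<hour>\d{1,2}):(?P<minute>\d{2})\s+"
    r"(?P<title>.+?)(?:\s+at\s+(?P<location>.+?))?\s*$",
    re.IGNORECASE,
)

def parse_quick_add(message):
    """Return a dict with start time, title and location, or None if nothing matches."""
    match = QUICK_ADD.match(message)
    if match is None:
        return None
    hour, minute = int(match["hour"]), int(match["minute"])
    if hour > 23 or minute > 59:
        return None  # looks like a time but is not a valid one
    return {
        "start": time(hour, minute),
        "title": match["title"],
        "location": match["location"],  # None when no "at ..." part was typed
    }

print(parse_quick_add("10:20 meeting at Larry's"))
print(parse_quick_add("9:00 standup"))
print(parse_quick_add("what's on tomorrow?"))
```

A pattern like this falls over as soon as the phrasing changes ("meet Larry at his place at 10:20"), which is pretty much the point of the "NLP is hard" section below.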

Conversational interfaces are particularly well suited to mobile applications where the small screen, tiny keyboard and lack of a mouse makes traditional form-filling a tedious and error prone experience.

Consider, for example, the simple request 'what orders did we receive last month on a Friday after 4pm'. What would the dialog look like to specify a query like that? What if, instead, your users could simply enter what they are looking for? That's what a conversational natural language engine can do for your business.

NLP is hard

General purpose NLP is a really hard problem but for a specific application domain (like CRM integration, product support, home automation, ...) it's possible to define a sufficiently large recognition base that you can provide a good experience to your users. This library is focused on providing you the tools you need to create such domain specific chatbots or to add natural language capable input boxes to your traditional forms-based applications. ...

...

Software and Licensing

This Natural Language engine will soon be available for download and integration in your .NET projects. For personal, non-commercial projects there is no charge. For use in commercial applications and for any consulting requirements please ..."

AboditNLP - Natural Language Interface to Home Automation

"Rather than hunting through a multi-layer web-page-by-web-page interface all aspects of a home can be controlled and/or queried directly using single line commands issued from a smartphone or computer using SMS or a chat client like Google Talk.

Examples of things you can do when you connect a home automation system to this natural language engine can be seen in this actual dialog with my own home automation system:-

..."

Interesting... I can see the applications for this already. Imagine speech to text and then mix this in? Hmm... (of course that could lead to some pretty funny results too... lol)

 

Related Past Post XRef:
Feed Stream Analysis - Web Feed/Post Analysis to Group Like/Related Posts
SharpEntropy - Maximum Entropy Modeling
"WordCloud - A Squarified Treemap of Word Frequency" - Something like this would be cool in a Feed Reader...
WordNet
"Statistical parsing of English sentences"
"A Model for Weblog Research"
AddressOf.com - MS Research TreeMap.Net