LanguageTool is an open-source grammar tool, also known as the spellchecker for OpenOffice. It lets you detect grammar errors and spelling mistakes through a Python script or through a command-line interface. We will work with the language_tool_python Python package, which can be installed with the pip install language-tool-python command. By default, language_tool_python downloads a LanguageTool server .jar and runs it in the background to detect grammar errors locally. LanguageTool also offers a Public HTTP Proofreading API, which is supported as well, but it restricts the number of calls.
We will provide a practical example of how you can detect your grammar mistakes and also correct them. We will work with the following…
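As a rough illustration of the detect-and-correct workflow described above, here is a minimal sketch. The helper names check_text and demo and the sample sentence are made up for illustration; only LanguageTool("en-US"), check, correct and close are calls from language_tool_python itself. The local server also needs Java and is downloaded on first use.

```python
def check_text(tool, text):
    """Return (issues, corrected_text) for `text`.

    `tool` is any object exposing `check(text)` and `correct(text)`,
    as language_tool_python.LanguageTool does.
    """
    matches = tool.check(text)
    # Keep the rule message and at most three suggested replacements per match.
    issues = [(m.message, list(m.replacements)[:3]) for m in matches]
    corrected = tool.correct(text)
    return issues, corrected


def demo():
    # Imported lazily: requires `pip install language-tool-python` and Java.
    import language_tool_python

    tool = language_tool_python.LanguageTool("en-US")
    try:
        # Hypothetical sample sentence, not taken from the original post.
        issues, corrected = check_text(tool, "This are a example sentence with erors.")
        for message, suggestions in issues:
            print(message, "->", suggestions)
        print(corrected)
    finally:
        tool.close()  # stop the background LanguageTool server
```

Calling demo() runs the check against the locally started server; check_text itself works with any compatible tool object.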
In a previous post, we showed how to do text summarization with transformers. Here, we will provide an example of how to use transformers for question answering. We will work with the Hugging Face library and with Google Colab, so that the example is reproducible. First, we need to install the libraries:
!pip install transformers
!pip install torch
Now, we are ready to work with the transformers.
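A minimal sketch of question answering with the installed libraries could look like the following. The helper names, the context and the question are assumptions for illustration. The pipeline("question-answering") call downloads whichever default model the library selects, which may differ from the model used in the full post.

```python
def answer_question(qa, question, context):
    """Ask `qa` (a question-answering pipeline) and return (answer, score)."""
    result = qa(question=question, context=context)
    return result["answer"], result["score"]


def demo():
    # Imported lazily: requires the transformers and torch packages installed above.
    from transformers import pipeline

    qa = pipeline("question-answering")  # downloads a default model on first use
    # Hypothetical context and question, used only for illustration.
    context = (
        "Google Colab is a hosted notebook service. "
        "It lets you run Python code in the browser without any local setup."
    )
    answer, score = answer_question(qa, "Where does Colab run Python code?", context)
    print(answer, round(score, 3))
```

The pipeline returns the answer as a span extracted from the context, together with a confidence score.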
For an AI algorithm to answer questions about an input text, it must be able to “understand” the text, which is why we call it Natural Language Understanding (NLU), and it must be able to respond by generating text, which is why we call it Natural Language Generation (NLG). …
Data scientists often work with notebooks such as Jupyter and RMarkdown. Through notebooks, they can easily share their analysis in HTML format. But what about when there is a need to share the notebooks publicly? In this case, the most convenient way is to configure an Amazon S3 bucket to function as a static website. In this tutorial, we will provide a walkthrough example of how you can share your notebooks as a static website with AWS S3.
The report should be in HTML format. Let’s create a dummy report in R using RMarkdown. Let’s create the Rmd report:
…
We have provided a walkthrough example of Text Summarization with Gensim. Today, we will provide an example of text summarization using transformers with the Hugging Face library. The theory of transformers is out of the scope of this post, since our goal is to provide a practical example.
For this example, we will try to summarize the plot of the movie Fight Club, which we took from the Wikipedia Movie Plot dataset and also used for the GloVe model.
Let’s have a look at the long plot:
The unnamed Narrator is a traveling automobile recall specialist who suffers from insomnia. When he is unsuccessful at receiving medical assistance for it, the admonishing doctor suggests he realize his relatively small amount of suffering by visiting a support group for testicular cancer victims. The group assumes that he, too, is affected like they are, and he spontaneously weeps into the nurturing arms of another man, finding a freedom from the catharsis that relieves his insomnia. He decides to participate in support groups of various kinds, always allowing the groups to assume that he suffers what they do. However, he begins to notice another impostor, Marla Singer, whose presence reminds him that he is attending these groups dishonestly, and this disturbs his bliss. The two negotiate to avoid their attending the same groups, but, before going their separate ways, Marla gives him her phone number.
On a flight home from a business trip, the Narrator meets Tyler Durden, a soap salesman with whom he begins to converse after noticing the two share the same kind of briefcase. After the flight, the Narrator returns home to find that his apartment has been destroyed by an explosion. With no one else to contact, he calls Tyler, and they meet at a bar. After a conversation about consumerism, outside the bar, Tyler chastises the Narrator for his timidity about needing a place to stay. Tyler requests that the Narrator hit him, which leads the two to engage in a fistfight. The Narrator moves into Tyler’s home, a large dilapidated house in an industrial area of their city. They have further fights outside the bar on subsequent nights, and these fights attract growing crowds of men. The fighting eventually moves to the bar’s basement where the men form a club (“Fight Club”) which routinely meets only to provide an opportunity for the men to fight recreationally.
Marla overdoses on pills and telephones the Narrator for help; he eventually ignores her, leaving his phone receiver without disconnecting. Tyler notices the phone soon after, talks to her and goes to her apartment to save her. Tyler and Marla become sexually involved. He warns the Narrator never to talk to Marla about him. More fight clubs form across the country and, under Tyler’s leadership (and without the Narrator’s knowledge), they become an anti-materialist and anti-corporate organization, Project Mayhem, with many of the former local Fight Club members moving into the dilapidated house and improving it.
The Narrator complains to Tyler about Tyler excluding him from the newer manifestation of the Fight Club organization Project Mayhem. Soon after, Tyler leaves the house without notice. When a member of Project Mayhem is killed by the police during a botched sabotage operation, the Narrator tries to shut down the project. Seeking Tyler, he follows evidence of Tyler’s national travels. In one city, a Project Mayhem member greets the Narrator as Tyler Durden. The Narrator calls Marla from his hotel room and discovers that Marla also believes him to be Tyler. Tyler suddenly appears in his hotel room, and reveals that they are dissociated personalities in the same body. When the Narrator has believed himself to be asleep, Tyler has been controlling his body and traveling to different locations.
The Narrator blacks out after the conversation, and when he awakes, he uncovers Tyler’s plans to erase debt by destroying buildings that contain credit card companies’ records. The Narrator tries to warn the police, but he finds that these officers are members of the Project. He attempts to disarm the explosives in a building, but Tyler subdues him and moves him to the uppermost floor. Held at gunpoint by Tyler, the Narrator realizes that, in sharing the same body with Tyler, he himself is actually in control holding “Tyler’s” gun. The Narrator fires it into his own mouth, shooting through the cheek without killing himself. Tyler collapses with an exit wound to the back of his head, and the Narrator stops mentally projecting him. Afterward, Project Mayhem members bring a kidnapped Marla to him, believing him to be Tyler, and leave them alone. Holding hands, the Narrator and Marla watch as the explosives detonate, collapsing many buildings around them. …
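Summarizing a long plot like the one above could be sketched as follows. The helper names and the max_length and min_length values are assumptions. pipeline("summarization") downloads whichever default model the library selects, which is not necessarily the model used in the full post.

```python
def summarize(summarizer, text, max_length=130, min_length=30):
    """Return the summary string produced by a summarization pipeline."""
    result = summarizer(
        text,
        max_length=max_length,  # upper bound on summary length, in tokens
        min_length=min_length,  # lower bound on summary length, in tokens
        do_sample=False,        # deterministic decoding
        truncation=True,        # cut inputs that exceed the model's limit
    )
    return result[0]["summary_text"]


def demo(plot_text):
    # Imported lazily: requires the transformers and torch packages.
    from transformers import pipeline

    summarizer = pipeline("summarization")  # downloads a default model on first use
    print(summarize(summarizer, plot_text))
```

Passing the plot text to demo prints a short summary. Note that truncation=True silently drops any part of the input beyond the model's maximum length.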
IBM Watson has built a model for analyzing personalities. It takes text as input and returns personality insights by applying linguistic analytics and personality theory to infer attributes from a person’s unstructured text. You can play with the personality insights demo of IBM Watson to get an idea of what analytics it provides. In this tutorial, we will show how to get Personality Insights from IBM Watson using Python.
First, you will need to create an IBM Cloud account to get an API Key. …
A new trend in the Data Science and Data Engineering world is the term “Data Lakes”. According to Wikipedia:
A data lake is a system or repository of data stored in its natural/raw format. A data lake is usually a single store of data including raw copies of source system data, sensor data, social data etc and transformed data used for tasks such as reporting, visualization, advanced analytics and machine learning. …
I attended the Introduction to Designing Data Lakes in AWS course on Coursera, which included a lab about Glue. I found it very useful, which is why I decided to share it here.
In this tutorial we will show how:
The first thing that you need to do is to create an S3 bucket. For this example, I have created an S3 bucket called…
Disclaimer: This post contains affiliate links
In August 2019 I launched my Data Science blog Predictive Hacks. I have also shared my “Journey as a Data Science Blogger” on Medium. The blog is about Data Science and offers tips and tutorials, mainly in R and Python. In the beginning, I decided to keep my blog “clean”, without any ads. Over time, the blog's traffic grew steadily. More specifically, the weekly organic users were 350 in January 2020, 750 in June 2020, 1400 in September 2020 and 2600 in December 2020. Thus, during December 2020 there were 10K organic users. …
Disclaimer: I do not believe that any ML or AI model can predict the future price of stocks. In this tutorial, I provide an example of how you can apply LSTM models for predicting stock prices. Note that an LSTM model has many variations, depending on the architecture and the hyperparameters. Sorry for not helping you to become rich :-)
In a previous post, we explained how to predict stock prices using machine learning models. Today, we will show how we can use advanced artificial intelligence models such as Long Short-Term Memory (LSTM). …
Statistical arbitrage trading is a quantitative and computational approach to equity trading which is widely applied by hedge funds to produce market-neutral returns. The simplest and most popular version of the strategy is known as pairs trading and involves the identification of pairs of assets that are believed to have some long-run equilibrium relationship. By taking an appropriate long-short position on the pair when the spread has diverged sufficiently from its equilibrium value, a profit is made by unwinding the position once the spread converges back to equilibrium. Similar ideas govern more complicated strategies that consider a larger basket of assets. We will focus on the pairs trading strategy, endeavoring to specify precisely the concept of the long-run equilibrium relationship between two stocks, and then describe and apply a computational methodology for modeling the mispricing dynamics. …
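To make the idea of a diverging spread concrete, here is a simplified sketch, not the methodology of the full post. It assumes the equilibrium relationship is approximated by an ordinary least squares regression of one price series on the other, and that "diverged sufficiently" means a z-score beyond an arbitrary entry threshold of 2. The function names are illustrative.

```python
from statistics import mean, pstdev


def hedge_ratio(x, y):
    """OLS slope from regressing prices y on prices x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def spread_zscores(x, y):
    """Standardized spread y - beta * x, using in-sample mean and deviation."""
    beta = hedge_ratio(x, y)
    spread = [b - beta * a for a, b in zip(x, y)]
    mu, sigma = mean(spread), pstdev(spread)
    if sigma == 0:
        raise ValueError("spread has zero variance")
    return [(s - mu) / sigma for s in spread]


def signals(zscores, entry=2.0):
    """-1: short the spread (short y, long x); +1: long the spread; 0: flat."""
    return [-1 if z > entry else 1 if z < -entry else 0 for z in zscores]
```

This in-sample version ignores cointegration testing, transaction costs and exit rules, all of which a real strategy would need.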