(Day 107) Transforming natural language to charts

Ivan Ivanov · April 17, 2024

Hello :) Today is Day 107!

A quick summary of today:

  • created text2chart - transforming natural language to charts [github] [webapp]

I am super sick, but in the around 3 hours of my fever not being high, I managed to study a bit.

I was again thinking about creating some nlp project but was stuck on the idea of how to host a model. After looking around I found this article where they create a similar app called chat2vis and they use codellama. I know codellama is an open source model that I can get from huggingface and I wondered, how did they host it. Turns out - they ask the user to input a huggingface api key, which then allows for querying and generating content. (AWESOME! this is applicable to some of the other PDF chat apps that I have, but I will look to make an app of them later).

My fever eased a bit, and I sat on my chair to do some work ^^

Here is a breakdown of the app.

I used streamlit. Also, 2 default datasets during dev: Financial Statements of Major Companies(2009-2023) from Kaggle and Boston house price data from Kaggle.

Usage

  1. Input your HuggingFace API key

image

  1. Load your CSV data (optional)

image

  1. Select data to visualize (default is the financial statements data)

image

  1. Enter a query, run and get your chart

image

I did some more experimenting as well (below are query+resulting chart):

Display the revenue, net income, and EPS of each top company for the latest fiscal year in separate vertical bar charts for easy comparison.

image

Compare the revenue, net income, and EPS of the top companies for the latest fiscal year using horizontal bar charts.

image

Group by companies in the IT category and create a pie chart showing a breakdown of each company’s market share.

image

More examples in the README of the github repo.

However there are still some limitations, that I can look into when I am better:

  • right now it can read columns as either categorial (if dtype is not int/float) or numerical (int/float), so it might have problems with dates
  • might not be able to make a more complicated chart (or may need a more specific query)

That is all for today!

See you tomorrow :)