What is the value of this article?

We are young, inexperienced, and prone to fail: we’re two first-time founders. I always prefer to learn from people who are just a few steps ahead of me.

Who is this article for?

If you’re a multi-time entrepreneur, you might not find this article interesting. If you’ve been thinking about launching your own startup, I’d recommend reading as many similar articles as possible.

Our Team

Artem (author of this article):

Maksym:


Four different approaches to designing a multi-language Elasticsearch index

Photo by Joel Naren on Unsplash

When I was designing the Elasticsearch index for NewsCatcherAPI, one of the biggest problems I had was handling multi-language news articles.

I knew that Elasticsearch has pre-built analyzers for the most popular languages. The question was “How do I store documents in different languages so that I can search them all together (if needed)?”

Important: in our case, each document was already labeled with the correct language. Still, this is not required for all the approaches described in this post.
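To make the question concrete, here is a minimal sketch of one possible layout. It is an illustration under stated assumptions, not necessarily one of the approaches this post describes: the `title` and `language` field names, the three languages chosen, and the helper names `build_index_body` and `build_search_body` are all assumed for the example. It only builds the request bodies as plain Python dictionaries and does not talk to a cluster.

```python
import json

# Built-in Elasticsearch language analyzers, keyed by language code.
# The choice of these three languages is an assumption for this sketch.
LANGUAGE_ANALYZERS = {"en": "english", "fr": "french", "de": "german"}


def build_index_body(analyzers):
    """Return an index body with one `title` sub-field per language analyzer."""
    sub_fields = {
        code: {"type": "text", "analyzer": analyzer}
        for code, analyzer in analyzers.items()
    }
    return {
        "mappings": {
            "properties": {
                # Multi-field: the same text is indexed once per analyzer.
                "title": {"type": "text", "fields": sub_fields},
                "language": {"type": "keyword"},
            }
        }
    }


def build_search_body(query, language=None, analyzers=LANGUAGE_ANALYZERS):
    """Search one language's sub-field, or every sub-field if no language is given."""
    codes = [language] if language else list(analyzers)
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": [f"title.{code}" for code in codes],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(LANGUAGE_ANALYZERS), indent=2))
    print(json.dumps(build_search_body("election", language="fr"), indent=2))
```

Indexing the same text under several analyzers costs extra disk space, but it allows a single query to target one language or all of them.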

Also, for the purposes of this post, let’s assume that each document (news article) has only 2 fields: title and language. Where language


Better to see something once than to hear about it a thousand times

TL;DR: The README of your repository should present your code as if it were a product that you sell.

Disclaimer: this is my personal blog. What I describe here works (or worked) for me. There is no guarantee that my approach will produce a 100% result. Also, chasing N GitHub stars for its own sake is a bad approach; I just want to help you understand how to make other people notice the work you have already done.

Intro

Six months ago, when my friend and I were preparing to launch a closed beta of NewsCatcherAPI, I had a simple…


Why developers should invest time in writing a clear README for their public repositories

Photo by Oxa Roxa on Unsplash

In this article, I would like to share my observations on how many junior developers failed to convince me to hire them because their repositories had no README.

Useful links on how to craft a stunning README are at the end of the article.

What is a README?

Usually, a README is a file in a software project’s repository that briefly explains the project.

A README file is there to pitch your work.

A README is not your documentation (unless the documentation fits on one page).

Why should you make it?

If you make your repository public, you will most likely be judged by it. …


NEWSCATCHER

From 0 to 350 sign-ups within 2 months

Photo by davide ragusa on Unsplash

UPD: The Python package I talk about in this article does not depend on any external API. You do not need to be our client to use it!

About two months ago, I saw a problem coming. My side project newscatcherapi.com would soon be ready to launch as a closed beta, but we had 0 sign-ups on our email list.

For a little backstory, Newscatcher is a Data-as-a-Service company that builds an API to search through online news articles. Just as Google finds the most relevant web pages, we return data on the most relevant news articles.


A guide on how to release and sell your code without managing a website, servers, users, or payments. With $0 up-front cost.

Photo by David Rangel on Unsplash

In this post, I will go through my experience of developing, deploying and selling my API via an API marketplace. I did not have to set up a website or think about how to integrate payment processing solutions. I just wrote my code and deployed it.

Building a startup requires a team: a few jacks-of-all-trades who cover coding, marketing, and sales. It is also a long and exhausting path, so the chances of success are low.

You do not have to launch a startup to begin your own thing. …


NEWSCATCHER

A team of two planned and shipped a beta for 200 users in less than 2 months without quitting their full-time jobs.

newscatcherapi.com solution architecture for beta

Within 60 days we:

Official website of the product that I will talk about in this article.

In this article:

We are a team of 2 data engineers. During February and March 2020, we dedicated most of our spare time to building an API that lets you search news article data.

It is like querying…


NEWSCATCHER

The Newscatcher Python package allows you to automatically collect the latest news data from over 3,000 major online news websites.

Photo by Quinten de Graaf on Unsplash

As I write this article, many people have to work from home, and some have a lot of free time. You can use this time to build your portfolio, enhance your skills, or begin a side project.

The Newscatcher package makes it easy to collect and normalize news article data without any external dependencies. It was built while we were working on our main Data-as-a-Service product, Newscatcher API. We are a developer-first team, so we open-source as much as possible so that coders can partially replicate, for free, the work we have done.

The way to use our…


We analyzed all online media articles from March 1, 2020, to March 7, 2020, that mention US 2020 candidates to check who got the most media attention

Photo by Andy Feliciotti on Unsplash

At Politwire, we monitor and analyze the media coverage of politicians. We help political campaigns, businesses and media houses to understand politicians’ media paths.

Democratic Super Tuesday

On March 3, 2020, fourteen US states held primaries. About one-third of all delegates were involved. As you might already know, Joe Biden took back the lead by taking around 100 delegates more than Bernie Sanders.

Even though Joe Biden ended up as the winner, did that give him a significant lead in media coverage?

In the graph above, we can see the number of articles for each candidate for each day of the Super…


Or how to understand what official documentation is missing

Photo by Ehud Neuhaus on Unsplash

If I had to describe Elasticsearch in one phrase I would say something like:

When search meets analytics at scale (in near real time)

Elasticsearch is among the top 10 most popular open-source technologies at the moment. Fair enough: it unites many crucial features that are not unique individually but that, combined, can make the best search engine and analytics platform.

More precisely, Elasticsearch has become so popular due to a combination of the following features:
