My first open source Python module - Drupal data download

2020 has just arrived and it's time for something new. I have been using Drupal since 2006, so we have been together for almost 14 years! It has been an impressive journey and I look back on it with satisfaction - Drupal is simple to use if it fits the requirements exactly, and I stoically avoided touching any of its smelly PHP bowels. However, with the arrival of Drupal 8 I realised that I couldn't migrate even my simple and straightforward sites to it. Therefore, I decided to slowly migrate to Django + Wagtail. This will require some coding in Python on my side, but I don't mind that.

Binary self-contained logs for modern high-throughput systems

The number of large software systems in the world is growing steadily. First-generation solutions are being improved and made more sophisticated. Completely new areas open up for automation. More systems, more lines of code. More complexity everywhere. And, inevitably, more questions about the behaviour. And, of course, more errors, from bugs in the systems themselves to malfunctions in external dependencies and processes.

Bokeh graphs and pandas dataframe groupby object

Bokeh is a nice library that helps Python web developers visualise their data in the browser. It is on good terms with pandas, the statistical and data manipulation package beloved by data scientists, and it can source points directly from a dataframe object. Unfortunately, it can't yet take the result of a groupby directly and display it as multiple lines. But no worries, with just a few lines of code you can convince it to draw you a nice multiline graph. Take a look at the code snippet below:
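
Here is a minimal sketch of the idea rather than a definitive recipe. It assumes a reasonably recent Bokeh and pandas; the helper names groupby_to_multiline and plot_groups are made up for illustration. Each group becomes one list of x values and one list of y values, which is the shape Bokeh's multi_line glyph expects.

```python
import pandas as pd


def groupby_to_multiline(df, key, x, y):
    """Turn df.groupby(key) into the list-of-lists layout used by multi_line.

    Hypothetical helper: returns a dict with one entry per group in
    "xs", "ys" and "label".
    """
    xs, ys, labels = [], [], []
    for label, group in df.groupby(key):
        group = group.sort_values(x)  # draw each line from left to right
        xs.append(group[x].tolist())
        ys.append(group[y].tolist())
        labels.append(str(label))
    return {"xs": xs, "ys": ys, "label": labels}


def plot_groups(df, key, x, y):
    """Build a Bokeh figure with one line per group (sketch)."""
    # Imported here so the data-shaping helper also works without Bokeh.
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    source = ColumnDataSource(groupby_to_multiline(df, key, x, y))
    fig = figure(x_axis_label=x, y_axis_label=y)
    fig.multi_line(xs="xs", ys="ys", line_width=2, source=source)
    return fig
```

To display the result, pass the returned figure to bokeh.plotting.show, for example show(plot_groups(df, "city", "day", "temp")) for a dataframe with those (assumed) columns.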


Visual Studio 2017 and CMake

Microsoft is continuously improving its record with the open source community. First it dumped massive chunks of .NET onto GitHub, then it actually made a dedicated effort to clean the code up and make it portable to Linux and Mac. Now comes another step, albeit a smaller one, in the same direction. Visual Studio 2017 will support CMake projects natively, without the need to generate .vcxproj and .sln files first. This is great news, because it saves some effort for those working on cross-platform C++ products.

Scrapy and persistent cookie manager middleware

Scrapy is a nice Python framework for web scraping, i.e. extracting information from web sites automatically by crawling them. It works best with anonymous data discovery, but nothing stops you from having active sessions as well. In fact, Scrapy transparently manages cookies, which are usually used to track user sessions. Unfortunately, the sessions don't survive between runs. This, however, can be fixed quite easily by adding a custom cookie middleware. Here is an example:
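
What follows is a sketch under stated assumptions, not a drop-in implementation. It subclasses Scrapy's built-in CookiesMiddleware, pickles the cookies from its jars when the spider closes and puts them back when the spider opens. The class name PersistentCookiesMiddleware, the helper functions and the COOKIES_PERSISTENCE_FILE setting are all made up for illustration, and the code relies on the built-in middleware keeping its jars in a dict-like self.jars attribute.

```python
import os
import pickle

try:
    from scrapy.downloadermiddlewares.cookies import CookiesMiddleware
except ImportError:  # lets the helpers below be tried without Scrapy
    CookiesMiddleware = object


def save_cookies(jars, path):
    """Pickle every cookie from every jar, keyed by the jar's name."""
    snapshot = {key: list(jar) for key, jar in jars.items()}
    with open(path, "wb") as fh:
        pickle.dump(snapshot, fh)


def load_cookies(jars, path):
    """Put previously saved cookies back into the jars, if the file exists."""
    if not os.path.exists(path):
        return
    with open(path, "rb") as fh:
        snapshot = pickle.load(fh)  # only load files you wrote yourself
    for key, cookies in snapshot.items():
        for cookie in cookies:
            jars[key].set_cookie(cookie)


class PersistentCookiesMiddleware(CookiesMiddleware):
    """Sketch: restore cookies on spider open, save them on spider close."""

    @classmethod
    def from_crawler(cls, crawler):
        from scrapy import signals

        mw = super().from_crawler(crawler)
        # COOKIES_PERSISTENCE_FILE is a hypothetical setting name.
        mw.cookie_file = crawler.settings.get(
            "COOKIES_PERSISTENCE_FILE", "cookies.pkl"
        )
        crawler.signals.connect(mw.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(mw.spider_closed, signal=signals.spider_closed)
        return mw

    def spider_opened(self, spider):
        load_cookies(self.jars, self.cookie_file)

    def spider_closed(self, spider):
        save_cookies(self.jars, self.cookie_file)
```

To try it, disable the built-in scrapy.downloadermiddlewares.cookies.CookiesMiddleware in DOWNLOADER_MIDDLEWARES by mapping it to None, and register this class in its place at the same priority (700). Note that session cookies are saved too, so a stale session may need the file deleted by hand.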


No more Mr JavaScript guy?

After doing some web development work recently, I clearly remembered why I hate JavaScript so much. Not only is it ugly as a language, lacking type checking or even decent syntax for inheritance (it is actually faked with various workarounds), but having effectively two separate code bases, front end and back end, also forces you to repeat quite a bit of code. And hunting for missing variables and fields or mismatched types is the favourite pastime of JavaScript developers.

Google Calendar - download all entries with Python

Google provides APIs to access its data using various languages. You can manipulate Google calendars, contacts, documents etc. Most of the time the usage is pretty straightforward, but sometimes it is not clear how to achieve a specific goal. For example, it took me some time to figure out how to download all events for a given calendar. The main reason behind the difficulty is the upper limit Google places on the number of calendar entries returned by a single query. There are API calls that help you to overcome this constraint. Below is the relevant code for your enjoyment.
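
A minimal sketch of the paging loop, assuming the Calendar API v3 and a service object of the kind returned by googleapiclient.discovery.build("calendar", "v3", credentials=...). The function name fetch_all_events is made up for illustration, and creating the credentials is left out. Each response carries a nextPageToken while more entries remain, so the loop keeps asking until the token disappears.

```python
def fetch_all_events(service, calendar_id="primary"):
    """Collect every event of one calendar by following page tokens.

    `service` is assumed to be a Calendar API v3 client object.
    """
    events = []
    page_token = None
    while True:
        response = service.events().list(
            calendarId=calendar_id,
            maxResults=2500,  # the largest page size the API accepts
            pageToken=page_token,
        ).execute()
        events.extend(response.get("items", []))
        # A missing nextPageToken means this was the last page.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return events
```

Each returned event is a plain dict, so something like [e.get("summary") for e in fetch_all_events(service)] gives the list of titles.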