{"version":"https:\/\/jsonfeed.org\/version\/1","title":"mathspp.com feed","home_page_url":"https:\/\/mathspp.com\/blog\/tags\/polars","feed_url":"https:\/\/mathspp.com\/blog\/tags\/polars.json","description":"Stay up-to-date with the articles on mathematics and programming that get published to mathspp.com.","author":{"name":"Rodrigo Gir\u00e3o Serr\u00e3o"},"items":[{"title":"TIL #114 \u2013 Implicit casting in dataframe concatenation","date_published":"2025-01-23T18:06:00+01:00","id":"https:\/\/mathspp.com\/blog\/til\/implicit-casting-in-dataframe-concatenation","url":"https:\/\/mathspp.com\/blog\/til\/implicit-casting-in-dataframe-concatenation","content_html":"<p>Today I learned that Polars allows non-strict vertical concatenation of dataframes with the parameter <code>how=\"vertical_relaxed\"<\/code>.<\/p>\n\n<h2 id=\"implicit-casting-in-dataframe-concatenation\">Implicit casting in dataframe concatenation<a href=\"#implicit-casting-in-dataframe-concatenation\" class=\"toc-anchor after\" data-anchor-icon=\"#\" aria-label=\"Anchor\"><\/a><\/h2>\n<p>Polars dataframes have an associated schema, a piece of metadata that describes the columns and their types:<\/p>\n<pre><code class=\"language-py\">import polars as pl\n\nclose_family = pl.DataFrame(\n    {\n        \"name\": [\"John\", \"Anne\"],\n        \"age\": [27, 35],\n    }\n)\n\nprint(close_family.schema)\n## Schema({'name': String, 'age': Int64})<\/code><\/pre>\n<p>By default, Polars uses the type <code>pl.Int64<\/code> when a column contains integers.\nHowever, since ages don't tend to get very big, and because they're never negative, it's enough to use the data type <code>pl.UInt8<\/code>:<\/p>\n<pre><code class=\"language-py\">extended_family = pl.DataFrame(\n    {\n        \"name\": [\"Rob\", \"Jessica\"],\n        \"age\": [47, 28],\n    },\n    schema_overrides={\n        \"age\": pl.UInt8,\n    },\n)\n\nprint(extended_family.schema)\n## Schema({'name': String, 'age': UInt8})<\/code><\/pre>\n<p>Now, if I try 
to use <code>pl.concat<\/code> to concatenate these two vertically, Polars complains because the columns <code>age<\/code> in both dataframes have different types:<\/p>\n<pre><code class=\"language-py\">pl.concat([close_family, extended_family], how=\"vertical\")<\/code><\/pre>\n<pre><code>polars.exceptions.SchemaError: type UInt8 is incompatible with expected type Int64<\/code><\/pre>\n<p>Polars is very strict about data types (and rightfully so) and that is why it complains.\nIn many situations, you can ask Polars to be more lenient by specifying <code>strict=False<\/code> but <code>pl.concat<\/code> does not support this argument.\nInstead, today I learned that it supports <code>how=\"vertical_relaxed\"<\/code><sup id=\"fnref1:1\"><a href=\"#fn:1\" class=\"footnote-ref\">1<\/a><\/sup>:<\/p>\n<pre><code class=\"language-py\">pl.concat([close_family, extended_family], how=\"vertical_relaxed\")<\/code><\/pre>\n<pre><code>shape: (4, 2)\n\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2510\n\u2502 name    \u2506 age \u2502\n\u2502 ---     \u2506 --- \u2502\n\u2502 str     \u2506 i64 \u2502\n\u255e\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2561\n\u2502 John    \u2506 27  \u2502\n\u2502 Anne    \u2506 35  \u2502\n\u2502 Rob     \u2506 47  \u2502\n\u2502 Jessica \u2506 28  \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2518<\/code><\/pre>\n<p>I don't know for sure, but I'm guessing the reason we have <code>how=\"vertical_relaxed\"<\/code> instead of <code>strict=False<\/code> is that the parameter <code>strict<\/code> is completely irrelevant for the other types of concatenation supported by <code>pl.concat<\/code>, so the Polars devs decided to fold that functionality into the parameter <code>how<\/code>.<\/p>\n<div class=\"footnotes\">\n<hr>\n<ol>\n<li id=\"fn:1\">\n<p>I was giving a Polars training and a participant 
taught me this. You learn a lot when you teach!\u00a0<a href=\"#fnref1:1\" rev=\"footnote\" class=\"footnote-backref\">\u21a9<\/a><\/p>\n<\/li>\n<\/ol>\n<\/div>","summary":"Today I learned that Polars allows non-strict vertical concatenation of dataframes with the parameter `how=&#039;vertical_relaxed&#039;`.","date_modified":"2025-10-20T22:34:56+02:00","tags":["polars","programming","python"],"image":"\/user\/pages\/02.blog\/04.til\/114.implicit-casting-in-dataframe-concatenation\/thumbnail.webp"},{"title":"TIL #108 \u2013 Date sequences in Polars","date_published":"2024-12-10T17:52:00+01:00","id":"https:\/\/mathspp.com\/blog\/til\/date-sequences-in-polars","url":"https:\/\/mathspp.com\/blog\/til\/date-sequences-in-polars","content_html":"<p>Today I learned how to use the Polars function <code>pl.date_range<\/code> to create date sequences with calendar-aware intervals between dates.<\/p>\n\n<h2 id=\"date-sequences-in-polars\">Date sequences in Polars<a href=\"#date-sequences-in-polars\" class=\"toc-anchor after\" data-anchor-icon=\"#\" aria-label=\"Anchor\"><\/a><\/h2>\n<p>Polars provides a function <code>polars.date_range<\/code> that is able to produce date sequences with calendar-aware intervals.\nFor example, I'm writing this article on the 10th of December.\nIf I say I will publish it \u201cin a month from now\u201d, I'm talking about the 10th of January, which is 31 days away.\nIf today were the 10th of February and I said the same thing, then I'd be saying I would publish this article on the 10th of March, which would be 28 or 29 days away (depending on whether it's a leap year or not).<\/p>\n<p>These \u201ccalendar-aware\u201d intervals are supported by Polars and, in particular, the function <code>polars.date_range<\/code> supports them as well:<\/p>\n<pre><code class=\"language-py\">import datetime as dt\nimport polars as pl\n\nprint(\n    pl.date_range(\n        start=dt.date(2024, 12, 10),\n        end=dt.date(2025, 5, 1),\n        interval=\"1mo\",  # 1 
month\n        eager=True,  # Produce the sequence right away.\n    )\n)<\/code><\/pre>\n<pre><code>shape: (5,)\nSeries: 'literal' [date]\n[\n    2024-12-10\n    2025-01-10\n    2025-02-10\n    2025-03-10\n    2025-04-10\n]<\/code><\/pre>\n<p>As you can see, when Polars uses the 1 month intervals, we get a number of dates that fall on the 10th of December, January, etc.<\/p>\n<p>At the time of writing, Polars supports 5 interval specifiers:<\/p>\n<table>\n<thead>\n<tr>\n<th style=\"text-align: left;\">Interval<\/th>\n<th style=\"text-align: left;\">Meaning<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align: left;\"><code>d<\/code><\/td>\n<td style=\"text-align: left;\">Day<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\"><code>w<\/code><\/td>\n<td style=\"text-align: left;\">Week<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\"><code>mo<\/code><\/td>\n<td style=\"text-align: left;\">Month<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\"><code>q<\/code><\/td>\n<td style=\"text-align: left;\">Quarter<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: left;\"><code>y<\/code><\/td>\n<td style=\"text-align: left;\">Year<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>These can be combined and used together.\nFor example, <code>\"1mo2d\"<\/code> means 1 month and 2 days:<\/p>\n<pre><code class=\"language-py\">import datetime as dt\nimport polars as pl\n\nprint(\n    pl.date_range(\n        start=dt.date(2024, 12, 10),\n        end=dt.date(2025, 5, 1),\n        interval=\"1mo2d\",  # 1 month and 2 days.\n        eager=True,\n    )\n)<\/code><\/pre>\n<pre><code>shape: (5,)\nSeries: 'literal' [date]\n[\n    2024-12-10\n    2025-01-12\n    2025-02-14\n    2025-03-16\n    2025-04-18\n]<\/code><\/pre>","summary":"Today I learned how to use the Polars function `pl.date_range` to create date sequences with calendar-aware intervals between 
dates.","date_modified":"2025-10-20T22:34:56+02:00","tags":["polars","programming","python"],"image":"\/user\/pages\/02.blog\/04.til\/108.date-sequences-in-polars\/thumbnail.webp"}]}
