Pandas: how to calculate a rolling window over one column (grouped by date) and count distinct values of another column?

Question

Home

Pandas: how to calculate a rolling window over one column (grouped by date) and count distinct values of another column?

0

I am trying to calculate in Pandas a rolling window over one date column and count the distinct values in another column. Let’s say I have this df dataframe:

date    customer 2020-01-01  A 2020-01-02  A 2020-01-02  B 2020-01-03  A 2020-01-03  C 2020-01-03  D 2020-01-04  E

I would like to group by the datecolumn, create a rolling window of two days and count the distinct values in the column customer. The expected output would be something like:

date       distinct_customers 2020-01-01  NaN --> (first value) 2020-01-02  2.0 --> (distinct customers between 2020-01-01 and 2020-01-02: [A, B])  2020-01-03  4.0 --> (distinct customers between 2020-01-02 and 2020-01-03: [A, B, C, D]) 2020-01-04  4.0 --> (distinct customers between 2020-01-03 and 2020-01-04: [A, C, D, E])

It seems easy but I don’t seem to find any straight-forward way to achieve that, I’ve tried using groupby or rolling. I don’t find other posts solving this issue. Does someone have any idea how to do this? Thanks a lot in advance!

Claudeguillermobernadine Asked on July 15, 2020 in Python.

Share
Comment(0)

Add Comment

1 Answer(s)

Votes
Oldest

0

Based on the idea of @Musulmon, this one liner should do it:

pd.crosstab(df['date'], df['customer']).rolling(2).sum().clip(0,1).sum(axis=1)

Thanks!

Jessiemarielsa Answered on July 15, 2020.

Share
Comment(0)

Add Comment

Your Answer

Answer 1

BuddyPress is a plugin for WordPress that enables you to create a social network or community website. It has all the...

Answer 2

I value you getting some margin to help me with this task. Without you, no part of this would have...

Answer 3

Try to define a Cohesive class, until and unless the methods are written relevant to the class and it defines...

Answer 4

Try to add exportAllData: true, as an other option, hope it helps :)

Answer 5

DataSet can read an XML, infer schema and create a tabular representation that's easy to manipulate: DataSet ip1 = new...

Answer 6

I created a class and used Xml Linq : using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml; using...

Answer 7

XDocument first = XDocument.Load(args[0]); XDocument second = XDocument.Load(args[1]); var result = new XElement( "ipaddresses", first.Root.Elements("ip") .Zip(second.Root.Elements("ip"), (f, s) => {...

Answer 8

Following your code for the header row, you could achieve this by an <xsl:apply-templates select="/report/order_actions/order_action[order_id = current()/order_id]" /> As well...

Answer 9

BuddyPress is a plugin for WordPress that enables you to create a social network or community website. It has all the...

Answer 10

I value you getting some margin to help me with this task. Without you, no part of this would have...

Answer 11

Try to define a Cohesive class, until and unless the methods are written relevant to the class and it defines...

Answer 12

Try to add exportAllData: true, as an other option, hope it helps :)

Answer 13

DataSet can read an XML, infer schema and create a tabular representation that's easy to manipulate: DataSet ip1 = new...

Answer 14

I created a class and used Xml Linq : using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml; using...

Answer 15

XDocument first = XDocument.Load(args[0]); XDocument second = XDocument.Load(args[1]); var result = new XElement( "ipaddresses", first.Root.Elements("ip") .Zip(second.Root.Elements("ip"), (f, s) => {...

Answer 16

Following your code for the header row, you could achieve this by an <xsl:apply-templates select="/report/order_actions/order_action[order_id = current()/order_id]" /> As well...

LATEST ANSWERS

Pandas: how to calculate a rolling window over one column (grouped by date) and count distinct values of another column?

Your Answer

TOP USERS

HOT QUESTIONS

LATEST ANSWERS

Pandas: how to calculate a rolling window over one column (grouped by date) and count distinct values of another column?

Your Answer

Tags Widget

TOP USERS

HOT QUESTIONS