Using Pandas to merge similar data


How do I merge similar data such as "recommendation" into one value?

    df['Why you choose us'].str.lower().value_counts()

    location           35
    recommendation     23
    recommedation       8
    confort             7
    availability        4
    reconmmendation     3
    facilities          3

Fredshawnayvonne Asked on July 16, 2020 in Python.


1 Answer(s)


    print(df)

                reason  count
    0         location     35
    1   recommendation     23
    2    recommedation      8
    3          confort      7
    4     availability      4
    5  reconmmendation      3
    6       facilities      3

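Use .groupby() on a partial string (here the first four characters of each reason) together with .transform('sum'), so that every row receives the total count of its group of similar spellings: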

    df['groupcount'] = df.groupby(df.reason.str[0:4])['count'].transform('sum')

                reason  count  groupcount
    0         location     35          35
    1   recommendation     23          34
    2    recommedation      8          34
    3          confort      7           7
    4     availability      4           4
    5  reconmmendation      3          34
    6       facilities      3           3

If you want to see the full string and the partial string side by side, try:

    df = df.assign(groupname=df.reason.str[0:4])
    df['groupcount'] = df.groupby(df.reason.str[0:4])['count'].transform('sum')
    print(df)

                reason  count groupname  groupcount
    0         location     35      loca          35
    1   recommendation     23      reco          34
    2    recommedation      8      reco          34
    3          confort      7      conf           7
    4     availability      4      avai           4
    5  reconmmendation      3      reco          34
    6       facilities      3      faci           3
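The prefix grouping above totals the counts but leaves the misspelled labels in place. A minimal sketch of going one step further and collapsing each group into a single label, assuming the most frequent spelling in each prefix group is the correct one (that rule and the column name `merged` are my assumptions, not part of the original answer):

```python
import pandas as pd

# Same data as the example table above
df = pd.DataFrame({
    "reason": ["location", "recommendation", "recommedation", "confort",
               "availability", "reconmmendation", "facilities"],
    "count": [35, 23, 8, 7, 4, 3, 3],
})

# Group key: the first four characters of each reason
prefix = df["reason"].str[:4]

# Within each prefix group, take the row with the largest count
best_rows = df.loc[df.groupby(prefix)["count"].idxmax()]

# Lookup table: prefix -> most frequent spelling (assumed canonical)
canonical = dict(zip(best_rows["reason"].str[:4], best_rows["reason"]))

# Replace every variant with the canonical spelling of its group
df["merged"] = prefix.map(canonical)

# Total the counts per merged label
merged_counts = df.groupby("merged")["count"].sum().sort_values(ascending=False)
print(merged_counts)
```

Note that a four-character prefix is a blunt key: it will not join spellings that differ early in the word (for example "confort" and "comfort"), and it could wrongly join unrelated words that happen to share a prefix.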

In case a single row holds several comma-separated reasons, as in your CSV, split and explode the column first:

    import pandas as pd

    # Read the csv
    df = pd.read_csv(r'path')

    # Lowercase, fill missing values, then turn each row's
    # 'Why you choose us' entry into a list of reasons
    df['Why you choose us'] = (df['Why you choose us'].str.lower().fillna('no comment given')).str.split(',')

    # Explode so that each reason is in its own row, with all the other attributes intact
    df = df.explode('Why you choose us')

    # Remove any whitespace around the values, then count them
    print(df['Why you choose us'].str.strip().value_counts())

    location            48
    no comment given    34
    recommendation      25
    confort              8
    facilities           8
    recommedation        8
    price                7
    availability         6
    reputation           5
    reconmmendation      3
    internet             3
    ac                   3
    breakfast            3
    tranquility          2
    cleanliness          2
    aveilable            1
    costumer service     1
    pool                 1
    comfort              1
    search engine        1
    Name: group, dtype: int64
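The exploded counts still list misspellings such as "recommedation" and "confort" separately, and a prefix key would not join "confort" with "comfort". One possible alternative, sketched here with the standard library's difflib, is to map each value to its closest entry in a hand-written list of correct spellings. The list `CANONICAL`, the helper `normalize` and the 0.8 cutoff are all assumptions for illustration; the counts are copied from the output above.

```python
from difflib import get_close_matches

import pandas as pd

# Hypothetical list of correct spellings; adjust it to your own data
CANONICAL = ["location", "recommendation", "comfort", "availability", "facilities"]


def normalize(reason, cutoff=0.8):
    """Return the closest canonical spelling, or the input if nothing is close enough."""
    matches = get_close_matches(reason, CANONICAL, n=1, cutoff=cutoff)
    return matches[0] if matches else reason


# A few of the counts from the value_counts() output above
counts = pd.Series({
    "location": 48,
    "recommendation": 25,
    "confort": 8,
    "recommedation": 8,
    "reconmmendation": 3,
    "comfort": 1,
})

# Group the counts by their normalized label and add them up
merged = counts.groupby(counts.index.map(normalize)).sum().sort_values(ascending=False)
print(merged)
```

On the raw column the same helper could be applied before counting, for example `df['Why you choose us'].str.strip().map(normalize).value_counts()`. Values with no match above the cutoff are left unchanged, so it is worth reviewing the result and tuning the cutoff or the list by hand.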

Barrettlucianojoan Answered on July 16, 2020.
