Pandas .loc doesn't work after using a regex?
Here is my data:
     player                   pos   avg
0    Antonio Brown            WR1   1.0
1    Julio Jones (11)         WR2   2.3
2    Odell Beckham Jr. (13)   WR3   2.8
3    Todd Gurley (11)         RB1   4.8
4    DeAndre Hopkins (9)      WR4   5.8
...  ...                      ...   ...
546  Kai Forbath (7)          K31   538.0
547  Cody Parkey              K32   539.0
548  Wil Lutz (5)             K33   542.0
549  Andrew Franks            K34   543.0
550  Caleb Sturgis            K35   544.0
I used the following regex code to get rid of the parentheses and all characters inside them:
df['player'] = df['player'].str.replace(r"\(.*\)", "", regex=True)
Which got me what I wanted:
     player              pos   adp
0    Antonio Brown       WR1   1.0
1    Julio Jones         WR2   2.3
2    Odell Beckham Jr.   WR3   2.8
3    Todd Gurley         RB1   4.8
4    DeAndre Hopkins     WR4   5.8
...  ...                 ...   ...
546  Kai Forbath         K31   538.0
547  Cody Parkey         K32   539.0
548  Wil Lutz            K33   542.0
549  Andrew Franks       K34   543.0
550  Caleb Sturgis       K35   544.0
However now when I use .loc, nothing shows up!
df.loc[(df.player=='Julio Jones')]
player  pos  adp  pos_adp  season
But when I use .loc on a player whose name didn’t originally have any parentheses, it does work:
df.loc[(df.player=='Antonio Brown')]
   player         pos  adp  pos_adp  season
0  Antonio Brown  WR1  1.0  1        2016
This is so frustrating. Why doesn’t .loc work on the rows where the regex actually changed the name?
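I believe that "Julio Jones (11)" became "Julio Jones " and not "Julio Jones" after the replace, because the pattern removed "(11)", not " (11)". The comparison against 'Julio Jones' then fails because of the trailing space. I suggest you use df['player'] = df['player'].str.strip() to get rid of the leading and trailing spaces (note that the result has to be assigned back to the column).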
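Here is a minimal sketch of what I think is happening, using three rows copied from your sample. The small DataFrame, the variable names, and the alternative pattern with a leading \s* are my own illustration and not taken from your code, so adjust them to your real data.

```python
import pandas as pd

# Tiny stand-in for the real data (three rows copied from the question).
df = pd.DataFrame({
    "player": ["Antonio Brown", "Julio Jones (11)", "Odell Beckham Jr. (13)"],
    "pos": ["WR1", "WR2", "WR3"],
    "adp": [1.0, 2.3, 2.8],
})

# The original replacement removes "(11)" but keeps the space in front of it.
replaced = df["player"].str.replace(r"\(.*\)", "", regex=True)
print(repr(replaced[1]))  # 'Julio Jones '  <- note the trailing space

# Option 1: strip leading/trailing whitespace and assign the result back.
df["player"] = replaced.str.strip()

# Option 2 (alternative, my assumption): let the pattern consume the
# whitespace before the opening parenthesis as well.
alt = pd.Series(["Julio Jones (11)"]).str.replace(r"\s*\(.*\)", "", regex=True)

# After stripping, the exact comparison finds the row again.
match = df.loc[df["player"] == "Julio Jones"]
print(match)
```

Printing with repr() is a quick way to spot stray whitespace that the normal DataFrame display hides.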