RE: Boolean indexer script runs without error but doesn't work
The first code block works; the second gives no error but doesn’t give the result I expected.
The first block creates a new column [‘Type’]. Stores of the same kind but with different names are binned in column [‘Type’]. So shop name A and shop name B are both in column [‘Naam’], and the script labels both as ‘Supermarket’ in column [‘Type’]. So far so good.
The second block of code is supposed to label every store, shop, etc. that is not named in the Namendict.test dictionary. I want these unrecognised shops and stores labelled as ‘Diversen’. Hope someone has a suggestion. Thanks!
1: working code:
from Namendict import test

for value in df['Naam']:
    for i, (k,v) in enumerate(test.items()):
        boolean_indexer = df['Naam'].str.contains(k)
        df.loc[boolean_indexer, 'Type'] = (v)
2: code that is supposed to work (no error, but also no ‘Diversen’ in column [‘Type’], just NaN):
from Namendict import test

for value in df['Naam']:
    for i, (k,v) in enumerate(test.items()):
        boolean_indexer = df['Naam'].str.contains(k)
        if True:
            df.loc[boolean_indexer, 'Type'] = (v)
        else:
            df.loc[boolean_indexer, 'Type'] = ('Diversen.')
Many thanks. Janneman
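There are multiple options to tackle this problem. In your second block the condition "if True:" is always true, so the else branch never runs and nothing is ever labelled ‘Diversen.’. The first option is simply to replace the NaN values afterwards with ‘Diversen.’ using the pandas fillna function. This looks as follows: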
from Namendict import test

# Looping over all existing records in the dict
for k,v in test.items():
    boolean_indexer = df['Naam'].str.contains(k)
    df.loc[boolean_indexer, 'Type'] = v

# Filling in all empty ("nan") values with "Diversen."
df['Type'] = df['Type'].fillna("Diversen.")
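To try this first option without the original data, here is a minimal self-contained sketch. The store names and the contents of test are made up for illustration (the real dictionary lives in Namendict), and regex=False is my own addition so that the keys are treated as plain substrings rather than regular expressions.

```python
import pandas as pd

# Hypothetical stand-in for Namendict.test: substring of the name -> type
test = {"Albert Heijn": "Supermarket", "Jumbo": "Supermarket", "Shell": "Fuel"}

# Hypothetical sample data
df = pd.DataFrame(
    {"Naam": ["Albert Heijn 1234", "Jumbo Utrecht", "Shell Station", "Bakkerij Jansen"]}
)

# Create the column up front so it exists even if no key matches
df["Type"] = None

# Label every row whose name contains one of the dictionary keys
for k, v in test.items():
    boolean_indexer = df["Naam"].str.contains(k, regex=False)
    df.loc[boolean_indexer, "Type"] = v

# Everything still empty was not recognised
df["Type"] = df["Type"].fillna("Diversen.")

print(df)
```

Only the last row stays unmatched in this sample, so it is the only one that fillna touches.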
Another option is to check whether the name exists in the ‘test’ dictionary. If so, the type stored in the dictionary is put in the DataFrame; if not, ‘Diversen.’ is filled in. This loops over the unique names in the column instead of over all the values, so the same action is not executed multiple times. Note that this lookup only finds names that are exactly equal to a dictionary key, not names that merely contain a key.
from Namendict import test

# Loop over all unique names in the DataFrame
for naam in df['Naam'].unique():
    # Select the rows with exactly this name
    boolean_indexer = df['Naam'] == naam
    if naam in test:  # Check if the name already exists in the dict
        # If True --> get type from the dictionary
        df.loc[boolean_indexer, 'Type'] = test[naam]
    else:
        # If False --> fill in 'Diversen.'
        df.loc[boolean_indexer, 'Type'] = "Diversen."
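If the dictionary keys are only parts of the store names (as the str.contains(k) in the question suggests), an exact lookup will not find them. A possible variant, sketched here with made-up data and a made-up test dictionary, keeps the substring matching: start with ‘Diversen.’ everywhere as the default and overwrite the rows that match a key. This is not taken from the original scripts, just one way to avoid the NaN step.

```python
import pandas as pd

# Hypothetical stand-in for Namendict.test: substring of the name -> type
test = {"Albert Heijn": "Supermarket", "Jumbo": "Supermarket"}

# Hypothetical sample data
df = pd.DataFrame({"Naam": ["Albert Heijn 1234", "Jumbo Utrecht", "Bakkerij Jansen"]})

# Default label for every row; matched rows are overwritten below
df["Type"] = "Diversen."

for k, v in test.items():
    # regex=False treats the key as a plain substring (my assumption about the keys)
    boolean_indexer = df["Naam"].str.contains(k, regex=False)
    df.loc[boolean_indexer, "Type"] = v

print(df)
```

If two keys can match the same name, the key that comes later in the dictionary wins, which is the same behaviour as the loop in the first option.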