Iterating through list and using remove() doesn't produce desired result

I’m a programming neophyte and would like some assistance in understanding why the following algorithm is behaving in a particular manner.

My objective is for the function to read in a text file containing words (can be capitalized), strip the whitespace, split the items into separate lines, convert all capital first characters to lowercase, remove all single characters (e.g., “a”, “b”, “c”, etc.), and add the resulting words to a list. All words are to be a separate item in the list for further processing.

Input file: A text file (‘sample.txt’) contains the following data – “a apple b Banana c cherry”

Desired output: ['apple', 'banana', 'cherry']

In my initial attempt I tried to iterate through the list of words to test if their length was equal to 1. If so, the word was to be removed from the list, with the other words remaining in the list. This resulted in the following, non-desired output: [None, None, None]

    filename = 'sample.txt'

    with open(filename) as input_file:
        word_list = input_file.read().strip().split(' ')
        word_list = [word.lower() for word in word_list]
        word_list = [word_list.remove(word) for word in word_list if len(word) == 1]

    print(word_list)

Produced non-desired output = [None, None, None]

My next attempt was to instead iterate through the list for words to test if their length was greater than 1. If so, the word was to be added to the list (leaving the single characters behind). The desired output was achieved using this method.

    filename = 'sample.txt'

    with open(filename) as input_file:
        word_list = input_file.read().strip().split(' ')
        word_list = [word.lower() for word in word_list]
        word_list = [word for word in word_list if len(word) > 1]

    print(word_list)

Produced desired output = ['apple', 'banana', 'cherry']

My questions are:

  1. Why didn’t the initial code produce the desired result when it seemed to be the most logical and most efficient?
  2. What is the best ‘Pythonic’ way to achieve the desired result?
4 Answer(s)

You got that output for two reasons:

  1. You’re removing items from the list as you’re looping through it
  2. You are trying to use the output of list.remove (which just modifies the list and returns None)
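
The second point can be checked in isolation. A minimal sketch, using a made-up list named words that is not part of the question:

```python
words = ["a", "apple", "b"]

# list.remove() mutates the list in place and returns None
result = words.remove("a")

print(result)  # None
print(words)   # ['apple', 'b']
```

Collecting those return values in a comprehension therefore yields one None per removed item.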

Your last list comprehension (word_list = [word_list.remove(word) for word in word_list if len(word) == 1]) is essentially equivalent to this:

    new_word_list = []
    for word in word_list:
        if len(word) == 1:
            new_word_list.append(word_list.remove(word))
    word_list = new_word_list

And as you loop through it this happens:

    # word_list == ['a', 'apple', 'b', 'banana', 'c', 'cherry']
    # new_word_list == []

    word = word_list[0]  # word == 'a'
    new_word_list.append(word_list.remove(word))
    # word_list == ['apple', 'b', 'banana', 'c', 'cherry']
    # new_word_list == [None]

    word = word_list[1]  # word == 'b'
    new_word_list.append(word_list.remove(word))
    # word_list == ['apple', 'banana', 'c', 'cherry']
    # new_word_list == [None, None]

    word = word_list[2]  # word == 'c'
    new_word_list.append(word_list.remove(word))
    # word_list == ['apple', 'banana', 'cherry']
    # new_word_list == [None, None, None]

    word_list = new_word_list
    # word_list == [None, None, None]

The best ‘Pythonic’ way to do this (in my opinion) would be:

    with open('sample.txt') as input_file:
        file_content = input_file.read()

    word_list = []
    for word in file_content.strip().split(' '):
        if len(word) == 1:
            continue
        word_list.append(word.lower())

    print(word_list)
Answered on July 16, 2020.

In your first approach, you are storing the result of word_list.remove(word) in the list, and that result is None, because list.remove() modifies the list in place and returns nothing.

Your second approach is the Pythonic way to achieve your goal.


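The second attempt is the most Pythonic. The first approach can still be made to work with an explicit loop, as long as the loop iterates over a copy of the list (word_list[:]) so that removals do not shift the items still to be visited:

    filename = 'sample.txt'

    with open(filename) as input_file:
        word_list = input_file.read().strip().split(' ')

    word_list = [word.lower() for word in word_list]

    # Iterate over a copy so that removing from word_list is safe
    for word in word_list[:]:
        if len(word) == 1:
            word_list.remove(word)

    print(word_list)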
Answered on July 16, 2020.

  1. Why didn’t the initial code produce the desired result when it seemed to be the most logical and most efficient?

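It’s advised never to alter a list while iterating over it. The loop keeps track of a position in the live list, not a snapshot of it, so removing an item shifts the remaining items one place to the left and the loop can skip the element that moved into the current position. In your code there is a second problem on top of that: the comprehension collects the return value of remove(), which is always None.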
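
A small illustration of how items get skipped, assuming a hypothetical list (named letters here) with two single-character words in a row:

```python
letters = ["a", "b", "apple"]

visited = []
for word in letters:
    visited.append(word)
    if len(word) == 1:
        # Removing shifts 'b' into index 0, which the loop has already passed
        letters.remove(word)

print(visited)  # ['a', 'apple']
print(letters)  # ['b', 'apple']
```

The loop never visits 'b', so it stays in the list. The sample data in the question happens to alternate single characters and longer words, which hides this effect.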

  2. What is the best ‘Pythonic’ way to achieve the desired result?

Your second attempt. I’d use more descriptive names, though, and the two comprehensions can be combined, since the first one only lowercases the words:

    word_list = input_file.read().strip().split(' ')
    filtered_word_list = [word.lower() for word in word_list if len(word) > 1]
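
One possible self-contained version of the combined approach is sketched below. The function name read_words is made up for illustration, and the sketch uses split() with no argument, which differs from the split(' ') in the question: it treats any run of whitespace, including newlines, as a separator.

```python
def read_words(path):
    """Return the lowercased words in the file that are longer than one character."""
    with open(path) as input_file:
        # split() with no argument splits on any run of whitespace
        # and ignores leading and trailing whitespace, so strip() is not needed
        words = input_file.read().split()
    return [word.lower() for word in words if len(word) > 1]
```

For the sample file from the question, read_words('sample.txt') should return ['apple', 'banana', 'cherry'].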