How to turn list into data frame after list comprehension

Question

Home

How to turn list into data frame after list comprehension

0

pandas

I have the following code and I was wondering how do I properly turn it into a data frame with country as one column and population as the other after looping through my function with list comprehension?

from bs4 import BeautifulSoup import html from urllib.request import urlopen import pandas as pd  countries = ['af', 'ax']  def get_data(countries):     url = 'https://www.cia.gov/library/publications/the-world-factbook/geos/'+countries+'.html'     page = urlopen(url)     soup = BeautifulSoup(page,'html.parser')     # geography     country = soup.find('span', {'class' : 'region'}).text     population = soup.find('div', {'id' : 'field-population'}).find_next('span').get_text(strip=True)     dataframe = [country, population]     dataframe = pd.DataFrame([dataframe])     return dataframe results = [get_data(p) for p in countries]

What I tried and it gives me the following data frame:

results = pd.DataFrame(results)                                        0                                       1 0   0 Afghanistan Name: 0, dtype: object    0 Afghanistan Name: 0, dtype: object 1   0 Akrotiri Name: 0, dtype: object       0 Akrotiri Name: 0, dtype: object

Vonhobertdoreen Asked on July 16, 2020 in Python.

Share
Comment(0)

Add Comment

3 Answer(s)

Votes
Oldest

0

I’m not quite sure why you’re returning it as a DataFrame from get_data(). If you return it as a dictionary, it will be much more logical for conversion to a dataframe later.

countries = ['af', 'ax']  def get_data(countries):     url = 'https://www.cia.gov/library/publications/the-world-factbook/geos/'+countries+'.html'     page = urlopen(url)     soup = BeautifulSoup(page,'html.parser')     # geography     country = soup.find('span', {'class' : 'region'}).text     population = soup.find('div', {'id' : 'field-population'}).find_next('span').get_text(strip=True)     scraped = {'country':country, 'population':population}      return scraped results = [get_data(p) for p in countries]

This returns a list of dictionaries such as:

[{'country': 'Afghanistan', 'population': '36,643,815'},  {'country': 'Akrotiri',   'population': 'approximately 15,500 on the Sovereign Base Areas of Akrotiri and Dhekelia including 9,700 Cypriots and 5,800 Service and UK-based contract personnel and dependents'}]

So when you convert with pd.DataFrame(results) you get:

       country                                         population 0  Afghanistan                                         36,643,815 1     Akrotiri  approximately 15,500 on the Sovereign Base Are...

Ernietinamonica Answered on July 16, 2020.

Share
Comment(0)

Add Comment

0

In [136]: from bs4 import BeautifulSoup      ...: import html      ...: from urllib.request import urlopen      ...: import pandas as pd      ...:      ...: countries = ['af', 'ax']      ...:      ...: def get_data(countries):      ...:     url = 'https://www.cia.gov/library/publications/the-world-factbook/geos/'+countries+'.html'      ...:     page = urlopen(url)      ...:     soup = BeautifulSoup(page,'html.parser')      ...:     # geography      ...:     country = soup.find('span', {'class' : 'region'}).text      ...:     population = soup.find('div', {'id' : 'field-population'}).find_next('span').get_text(strip=True)      ...:     json_str = {"country":country, "population":population}      ...:     return json_str      ...: results = [get_data(p) for p in countries]      ...: df = pd.DataFrame(results)  In [137]: df Out[137]:        country                                         population 0  Afghanistan                                         36,643,815 1     Akrotiri  approximately 15,500 on the Sovereign Base Are...

Jerryshannaella Answered on July 16, 2020.

Share
Comment(0)

Add Comment

0

If you rewrite your original function as:

def get_data(countries):     url = 'https://www.cia.gov/library/publications/the-world-factbook/geos/'+countries+'.html'     page = urlopen(url)     soup = BeautifulSoup(page,'html.parser')     # geography     country = soup.find('span', {'class' : 'region'}).text     population = soup.find('div', {'id' : 'field-population'}).find_next('span').get_text(strip=True)     return country, population

and call

results = [get_data(p) for p in countries]

as you suggested, you can do something like this:

def listToFrame(res, column_labels=None):     C = len(res[0]) # number of columns     if column_labels is None:         column_labels = list(range(C))     dct = {}     for c in range(C):         col = []         for r in range(len(res)):             col.append(res[r][c])         dct[column_labels[c]] = col     return pd.DataFrame(dct)  df = listToFrame(results)

or, even nicer,

df = listToFrame(results, ['Country', 'Population'])

Mohamedlucianofrankie Answered on July 16, 2020.

Share
Comment(0)

Add Comment

Your Answer

Answer 1

BuddyPress is a plugin for WordPress that enables you to create a social network or community website. It has all the...

Answer 2

I value you getting some margin to help me with this task. Without you, no part of this would have...

Answer 3

Try to define a Cohesive class, until and unless the methods are written relevant to the class and it defines...

Answer 4

Try to add exportAllData: true, as an other option, hope it helps :)

Answer 5

DataSet can read an XML, infer schema and create a tabular representation that's easy to manipulate: DataSet ip1 = new...

Answer 6

I created a class and used Xml Linq : using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml; using...

Answer 7

XDocument first = XDocument.Load(args[0]); XDocument second = XDocument.Load(args[1]); var result = new XElement( "ipaddresses", first.Root.Elements("ip") .Zip(second.Root.Elements("ip"), (f, s) => {...

Answer 8

Following your code for the header row, you could achieve this by an <xsl:apply-templates select="/report/order_actions/order_action[order_id = current()/order_id]" /> As well...

Answer 9

BuddyPress is a plugin for WordPress that enables you to create a social network or community website. It has all the...

Answer 10

I value you getting some margin to help me with this task. Without you, no part of this would have...

Answer 11

Try to define a Cohesive class, until and unless the methods are written relevant to the class and it defines...

Answer 12

Try to add exportAllData: true, as an other option, hope it helps :)

Answer 13

DataSet can read an XML, infer schema and create a tabular representation that's easy to manipulate: DataSet ip1 = new...

Answer 14

I created a class and used Xml Linq : using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Xml; using...

Answer 15

XDocument first = XDocument.Load(args[0]); XDocument second = XDocument.Load(args[1]); var result = new XElement( "ipaddresses", first.Root.Elements("ip") .Zip(second.Root.Elements("ip"), (f, s) => {...

Answer 16

Following your code for the header row, you could achieve this by an <xsl:apply-templates select="/report/order_actions/order_action[order_id = current()/order_id]" /> As well...

LATEST ANSWERS

How to turn list into data frame after list comprehension

Your Answer

TOP USERS

HOT QUESTIONS

LATEST ANSWERS

How to turn list into data frame after list comprehension

Your Answer

Tags Widget

TOP USERS

HOT QUESTIONS