How to create groups of elements and choose the largest value?


Hi Stack Overflow community!

I have the following data set:

0 A 0.000027769231 1 B 0.000030287440 0.628306 0.988151 1
0 A 0.000027479497 2 C 0.000035937793 0.581428 0.976041 1
1 B 0.000030287440 2 C 0.000035532483 0.516033 0.987388 1
4 D 0.000011085990 5 E 0.000008163211 0.577556 0.943583 1
4 D 0.000010787916 8 F 0.000008873166 0.531686 0.954017 1
5 E 0.000007865264 8 F 0.000008873166 0.691516 0.989945 1
311 G 0.000006216949 312 H 0.000002510852 0.829361 0.983148 1
326 M 0.000028129783 327 N 0.000011022112 0.843188 0.915627 1
326 M 0.000027462953 328 O 0.000002167529 1.742349 0.943267 1
326 M 0.000028024026 329 P 0.000005130416 1.263187 0.924010 1
326 M 0.000027630314 330 R 0.000002965539 1.668906 0.935518 1
326 M 0.000027721668 331 S 0.000002614498 1.851544 0.939051 1
326 M 0.000028129332 332 T 0.000003145471 1.742525 0.930186 1
327 N 0.000011020065 328 O 0.000002570277 2.473902 0.943474 1
327 N 0.000011028065 329 P 0.000005235456 1.447848 0.976569 1
327 N 0.000011032158 330 R 0.000003154471 2.303768 0.955479 1
327 N 0.000011025788 331 S 0.000002864823 2.038783 0.946972 1
327 N 0.000011064135 332 T 0.000003183160 1.213611 0.975056 1
328 O 0.000002505234 329 P 0.000005129224 1.549313 0.968629 1
328 O 0.000002452331 330 R 0.000002965465 2.328536 0.981076 1
329 P 0.000005147180 330 R 0.000003095314 2.803627 0.977268 1
329 P 0.000005208069 332 T 0.000003147536 2.658807 0.984912 1
330 R 0.000002967887 331 S 0.000002700052 1.208673 0.987825 1
330 R 0.000003110114 332 T 0.000003145140 2.428988 0.983747 1
331 S 0.000002853757 332 T 0.000003145464 1.551457 0.982276 1
366 I 0.000000326315 367 J 0.000000253986 1.410176 0.961879 1
366 I 0.000000327483 368 K 0.000000110327 1.236265 0.918510 1
366 I 0.000000326939 369 Q 0.000000165208 2.258098 0.907039 1
367 J 0.000000257330 368 K 0.000000113511 2.600934 0.907874 1
367 J 0.000000256872 369 Q 0.000000166861 1.102368 0.937099 1

Each row contains a unique pair of elements, which I denote here by letters. I want to create groups of these elements and choose the largest value from column 3 or 6 in each group. For this dataset I should get 5 groups, each with its elements and the maximum value from column 3 or 6:

A B C maxval: C: 0.000035937793

D E F maxval: D: 0.000011085990

G H maxval: G: 0.000006216949

M N O P R S T maxval: M: 0.000028129783

I J K Q maxval: I: 0.000000326939

As you can see, when the same element (e.g. A) appears in more than one row, its values in column 3 differ slightly. However, we can assume that A has the same value in column 3 in every case.

As output I want to get three files:

1. A list of groups with the maximum value from column 3 or 6.
2. A list of the elements with the largest value from column 3 or 6. I also want to include column 1 or 4 for each element:

2 C
4 D
311 G
326 M
366 I

3. A list of the other elements from every group:

0 A
1 B
5 E
8 F
312 H
327 N
328 O
329 P
330 R
331 S
332 T
367 J
368 K
369 Q

I have no idea how to do this in Python. Can anyone help me with some advice or code snippets?

Aldenjeffreylourdes Asked on July 16, 2020 in Algorithms, Python.


1 Answer(s)


I am not sure this answers exactly what you want, since some parts are unclear to me, but small adjustments can probably be made easily inside the loop.

With the help of pandas and numpy,

import pandas as pd
import numpy as np

We can load the data

data = pd.read_csv("data.txt", sep=" ", header=None)
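One caveat, offered only as a sketch: a single-space separator assumes exactly one space between fields, so runs of spaces or blank lines in the file could produce extra empty columns. The helper below is hypothetical (the name `parse_rows` is mine, and it uses plain Python rather than pandas). It splits on any run of whitespace and assumes the column layout shown in the question.

```python
def parse_rows(text):
    """Parse whitespace-separated rows into typed tuples.

    Assumed layout per row (taken from the sample in the question):
    index1 label1 value1 index2 label2 value2 <further columns, ignored here>
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()  # splits on any run of whitespace
        if len(fields) < 6:
            continue  # skip blank or incomplete lines
        rows.append((int(fields[0]), fields[1], float(fields[2]),
                     int(fields[3]), fields[4], float(fields[5])))
    return rows


# Sample with a doubled space and a blank line, to show both are tolerated.
sample = """\
0 A 0.000027769231 1 B 0.000030287440 0.628306 0.988151 1
0 A 0.000027479497  2 C 0.000035937793 0.581428 0.976041 1

311 G 0.000006216949 312 H 0.000002510852 0.829361 0.983148 1
"""
rows = parse_rows(sample)
print(len(rows), rows[1])
```

The last three columns are dropped here because the question only uses the first six.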

And define a function

# https://stackoverflow.com/questions/39915402/combine-a-list-of-pairs-tuples
def make_equiv_classes(pairs):
    groups = {}
    for (x, y) in pairs:
        xset = groups.get(x, set([x]))
        yset = groups.get(y, set([y]))
        jset = xset | yset
        for z in jset:
            groups[z] = jset
    return set(map(tuple, groups.values()))

And create our classes

classes = make_equiv_classes( data.values[:,[1,4]] )
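For comparison, the same grouping can be sketched with a disjoint-set (union-find) structure. This is a different technique from the set-merging function above, not a rewrite of it, and the name `group_pairs` is my own. The pairs below are a few of the letter pairs from the question.

```python
def group_pairs(pairs):
    """Return sorted groups of labels connected through any chain of pairs."""
    parent = {}

    def find(x):
        # Follow parent links to the representative of x's group.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for label in parent:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in groups.values())


pairs = [("A", "B"), ("A", "C"), ("B", "C"),
         ("D", "E"), ("D", "F"), ("E", "F"), ("G", "H")]
groups = group_pairs(pairs)
print(groups)  # [['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H']]
```

Each merge only relinks one representative instead of rewriting a whole set, which may matter if the groups get large.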

Then for each class

for cls in classes:
    print(sorted(cls))

    # Rows in which either element of the pair belongs to this class
    sub_class = data.loc[data[1].isin(cls) | data[4].isin(cls)]
    # Largest value found in column 3 or 6 (positions 2 and 5) of those rows
    max_class_value = np.max( sub_class.values[:,[2,5]] )

    # Position of the row that contains this maximum
    subclass_argmax = np.argmax( np.max( sub_class.values[:,[2,5]], axis=1) )
    # Note: this is always the first index of that row, whichever column held the maximum
    data_argmax = sub_class.iloc[subclass_argmax][0]

    first_letter = sub_class.iloc[subclass_argmax][1]
    second_letter = sub_class.iloc[subclass_argmax][4]

    print( "Max Class Value: {}".format(max_class_value))
    print( "Max Class Number: {}".format(data_argmax))
    print( "First letter: {}, Second Letter: {}".format(first_letter, second_letter))
    print( "\n")

This prints:

['M', 'N', 'O', 'P', 'R', 'S', 'T']
Max Class Value: 2.8129783000000003e-05
Max Class Number: 326
First letter: M, Second Letter: N

['G', 'H']
Max Class Value: 6.216949e-06
Max Class Number: 311
First letter: G, Second Letter: H

['D', 'E', 'F']
Max Class Value: 1.108599e-05
Max Class Number: 4
First letter: D, Second Letter: E

['I', 'J', 'K', 'Q']
Max Class Value: 3.27483e-07
Max Class Number: 366
First letter: I, Second Letter: K

['A', 'B', 'C']
Max Class Value: 3.5937793e-05
Max Class Number: 0
First letter: A, Second Letter: C
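Note that for the A/B/C group the loop reports index 0 and both letters of the row, while the maximum itself belongs to C (index 2). One possible adjustment, as a rough sketch in plain Python rather than pandas: keep the largest value seen per element, pick the top element of each group, and build the three lists the question asks for. The variable names and the choice to take the per-element maximum are my assumptions. Only a few rows from the question are included.

```python
from collections import defaultdict

# A few rows from the question: (index1, label1, value1, index2, label2, value2)
rows = [
    (0, "A", 0.000027769231, 1, "B", 0.000030287440),
    (0, "A", 0.000027479497, 2, "C", 0.000035937793),
    (1, "B", 0.000030287440, 2, "C", 0.000035532483),
    (4, "D", 0.000011085990, 5, "E", 0.000008163211),
    (4, "D", 0.000010787916, 8, "F", 0.000008873166),
    (5, "E", 0.000007865264, 8, "F", 0.000008873166),
    (311, "G", 0.000006216949, 312, "H", 0.000002510852),
]

# Largest value seen for each element, and which elements share a row with it.
best = {}                      # label -> (value, index)
neighbours = defaultdict(set)  # label -> labels paired with it
for i1, l1, v1, i2, l2, v2 in rows:
    for idx, label, value in ((i1, l1, v1), (i2, l2, v2)):
        if label not in best or value > best[label][0]:
            best[label] = (value, idx)
    neighbours[l1].add(l2)
    neighbours[l2].add(l1)

# Collect groups by walking the pair graph from every unvisited element.
groups, seen = [], set()
for start in best:
    if start in seen:
        continue
    stack, members = [start], []
    seen.add(start)
    while stack:
        label = stack.pop()
        members.append(label)
        for other in neighbours[label] - seen:
            seen.add(other)
            stack.append(other)
    groups.append(sorted(members))

# Build the three lists: groups with max, top elements, remaining elements.
group_lines, max_lines, other_lines = [], [], []
for members in groups:
    top = max(members, key=lambda label: best[label][0])
    group_lines.append("{} maxval: {}: {:.12f}".format(
        " ".join(members), top, best[top][0]))
    max_lines.append("{} {}".format(best[top][1], top))
    other_lines.extend("{} {}".format(best[m][1], m)
                       for m in members if m != top)

print("\n".join(group_lines))
print(max_lines)
print(other_lines)
```

Each list can then be written to its own file, for example with `"\n".join(lines)` and a file name of your choice.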

Earlemarialice Answered on July 16, 2020.
