
I am struggling to create a nested dictionary with the following data:

Team,       Group,  ID,  Score,  Difficulty
OneTeam,    A,      0,   0.25,   4
TwoTeam,    A,      1,   1,      10
ThreeTeam,  A,      2,   0.64,   5
FourTeam,   A,      3,   0.93,   6
FiveTeam,   B,      4,   0.5,    7
SixTeam,    B,      5,   0.3,    8
SevenTeam,  B,      6,   0.23,   9
EightTeam,  B,      7,   1.2,    4

Once the data is imported as a pandas DataFrame, I turn each column into a list: teams, group, id, score, diff.

Using the Stack Overflow answer "Create a complex dictionary using multiple lists", I can create the following dictionary:

{'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2},
 'FiveTeam': {'diff': 7, 'id': 4, 'score': 0.5},
 'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93},
 'OneTeam': {'diff': 4, 'id': 0, 'score': 0.25},
 'SevenTeam': {'diff': 9, 'id': 6, 'score': 0.23},
 'SixTeam': {'diff': 8, 'id': 5, 'score': 0.3},
 'ThreeTeam': {'diff': 5, 'id': 2, 'score': 0.64},
 'TwoTeam': {'diff': 10, 'id': 1, 'score': 1.0}}

using the code:

{team: {'id': i, 'score': s, 'diff': d} for team, i, s, d in zip(teams, id, score, diff)}

But what I'm after is having 'Group' as the main key, then team, and then id, score and difficulty within the team (as above).

I have tried:

{g: {team: {'id': i, 'score': s, 'diff': d}} for g, team, i, s, d in zip(group, teams, id, score, diff)}

but this doesn't work and results in only one team per group within the dictionary:

{'A': {'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93}},
 'B': {'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2}}}

Below is how the dictionary should look, but I'm not sure how to get there - any help would be much appreciated!

{'A': {'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93},
  'OneTeam': {'diff': 4, 'id': 0, 'score': 0.25},
  'ThreeTeam': {'diff': 5, 'id': 2, 'score': 0.64},
  'TwoTeam': {'diff': 10, 'id': 1, 'score': 1.0}},
 'B': {'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2},
  'FiveTeam': {'diff': 7, 'id': 4, 'score': 0.5},
  'SevenTeam': {'diff': 9, 'id': 6, 'score': 0.23},
  'SixTeam': {'diff': 8, 'id': 5, 'score': 0.3}}}
    Can you post your original data in text not image? Commented May 24, 2019 at 14:22

2 Answers


A dict comprehension may not be the best way of solving this if your data is stored in a table like this.

Try something like

from collections import defaultdict
groups = defaultdict(dict)
for g, team, i, s, d in zip(group, teams, id, score, diff):
    groups[g][team] = {'id': i, 'score': s, 'diff': d }

With defaultdict, if groups[g] already exists, the new team is added to it as a key; if it doesn't, an empty dict is created automatically and the new team is inserted into that.
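For reference, here is a self-contained sketch of the same loop run against the sample rows from the question. The list name `ids` (instead of `id`, to avoid shadowing the builtin) and the final conversion to a plain dict are my own additions, not part of the original answer.

```python
from collections import defaultdict

# Sample columns copied from the table in the question.
teams = ['OneTeam', 'TwoTeam', 'ThreeTeam', 'FourTeam',
         'FiveTeam', 'SixTeam', 'SevenTeam', 'EightTeam']
group = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
ids = [0, 1, 2, 3, 4, 5, 6, 7]  # assumption: renamed from `id`
score = [0.25, 1, 0.64, 0.93, 0.5, 0.3, 0.23, 1.2]
diff = [4, 10, 5, 6, 7, 8, 9, 4]

groups = defaultdict(dict)
for g, team, i, s, d in zip(group, teams, ids, score, diff):
    # groups[g] is created as {} on first access and reused afterwards.
    groups[g][team] = {'id': i, 'score': s, 'diff': d}

# Optional: convert to a plain dict so that looking up a missing group
# later raises KeyError instead of silently creating an empty entry.
result = dict(groups)

print(sorted(result))       # ['A', 'B']
print(sorted(result['A']))  # ['FourTeam', 'OneTeam', 'ThreeTeam', 'TwoTeam']
```

Unlike a single flat comprehension keyed on the group, each assignment here adds to the existing inner dict rather than replacing it, so every team is kept.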

Edit: you edited your question to say that your data is in a pandas DataFrame. In that case you can skip turning the columns into lists and work on the rows directly, for example:

from collections import defaultdict
groups = defaultdict(dict)
for row in df.itertuples():
    groups[row.Group][row.Team] = {'id': row.ID, 'score': row.Score, 'diff': row.Difficulty} 

2 Comments

I convert the features into lists (which I should have mentioned earlier, now edited). Would that mean a dict comprehension could be used?
As bracco23 pointed out, you can use a dict comprehension. The question is whether you would want to, and the answer is probably no, because "readability counts". I'll add to my answer how you might do it directly from the dataframe.

If you absolutely want to use a comprehension, then this should work:

# Materialize the rows as a list: in Python 3, zip() is a one-shot iterator
# and would be exhausted after the first group.
rows = list(zip(teams, group, id, score, diff))
d = {  # outer dict, one entry for each distinct group
    grp: {  # inner dict, one entry per team, filtered by group
        team: {'id': i, 'score': s, 'diff': df}
        for team, g, i, s, df in rows
        if g == grp
    }
    for grp in set(group)
}

I added line breaks for clarity.
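One thing to watch with this pattern, assuming Python 3: `zip` returns a one-shot iterator, so a comprehension that scans the zipped rows once per group only sees data on the first scan unless the rows are stored in a list first. A minimal sketch with made-up toy data (the names `pairs`, `rows` and `by_group` are only for illustration):

```python
# Toy data, not from the question: three names with their group labels.
names = ['x', 'y', 'z']
labels = ['A', 'B', 'A']

# Iterating a zip object twice: the second pass finds nothing left.
pairs = zip(names, labels)
first_pass = [name for name, g in pairs if g == 'A']
second_pass = [name for name, g in pairs if g == 'B']  # iterator already exhausted

# Materializing the rows once makes repeated scans safe.
rows = list(zip(names, labels))
by_group = {
    key: [name for name, g in rows if g == key]
    for key in sorted(set(labels))
}

print(first_pass, second_pass)  # ['x', 'z'] []
print(by_group)                 # {'A': ['x', 'z'], 'B': ['y']}
```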

EDIT:

After the comment, to better clarify my intention and out of curiosity, I ran a comparison:

from collections import defaultdict
import timeit

teams = ['OneTeam', 'TwoTeam', 'ThreeTeam', 'FourTeam', 'FiveTeam', 'SixTeam', 'SevenTeam', 'EightTeam']
group = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
id = [0, 1, 2, 3, 4, 5, 6, 7]
score = [0.25, 1, 0.64, 0.93, 0.5, 0.3, 0.23, 1.2] 
diff = [4, 10, 5, 6, 7, 8, 9, 4]

def no_comprehension():
    global group, teams, id, score, diff
    groups = defaultdict(dict)
    for g, team, i, s, d in zip(group, teams, id, score, diff):
        groups[g][team] = {'id': i, 'score': s, 'diff': d }

def comprehension():
    global group, teams, id, score, diff
    # list(...) because zip() is a one-shot iterator in Python 3
    rows = list(zip(teams, group, id, score, diff))
    d = {grp: {team: {'id': i, 'score': s, 'diff': df} for team, g, i, s, df in rows if g == grp} for grp in set(group)}

print("no comprehension:")
print(timeit.timeit(lambda : no_comprehension(), number=10000))
print("comprehension:")
print(timeit.timeit(lambda : comprehension(), number=10000))


Output:

no comprehension:
0.027287796139717102
comprehension:
0.028979241847991943

The two perform about the same on this input. With my wording above, I only meant to present this as an alternative to the solution already posted by @JohnO.
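As a side note, a comprehension that filters by group scans all rows once per group, which may matter with many groups. If a comprehension is still wanted, one possible variant is to sort the rows and use `itertools.groupby`. This is a sketch of a different technique, not what either answer above does; the `rows` tuples are a subset of the question's table, deliberately interleaved.

```python
from itertools import groupby
from operator import itemgetter

# (group, team, id, score, diff) tuples taken from the question's table,
# deliberately out of group order.
rows = [
    ('A', 'OneTeam', 0, 0.25, 4),
    ('B', 'FiveTeam', 4, 0.5, 7),
    ('A', 'TwoTeam', 1, 1, 10),
    ('B', 'SixTeam', 5, 0.3, 8),
]

# groupby only merges adjacent items, so sort by the group column first.
rows_sorted = sorted(rows, key=itemgetter(0))

nested = {
    g: {team: {'id': i, 'score': s, 'diff': d} for _, team, i, s, d in members}
    for g, members in groupby(rows_sorted, key=itemgetter(0))
}

print(list(nested))       # ['A', 'B']
print(list(nested['A']))  # ['OneTeam', 'TwoTeam']
```

The sort costs O(n log n), so for small inputs like the one in the question the defaultdict loop is likely still the simpler choice.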

1 Comment

Do you think this would be the fastest approach? I get the impression from you saying 'if you absolutely want to use' that it may not be the best method. @bracco23
