Manipulating Python’s Built In Data Structures
Having been a constant user of python, the available data structures have been very helpful throughout my work. Here are some useful snippets that I try to keep in mind!
Lists
1. Getting unique ordered values from a list (by using dictionary!)
Use case: Removing duplicates but Insertion order matters!
- May not be the most efficient, but way cleaner
- Why not use a set? Because order is lost in a set!
list_with_dupes = [1, 1, 2, 3, 3, 3]
list_without_dupes = list(dict.fromkeys(list_with_dupes))
## if insertion order did not matter (or for python 3.5 or lower iirc)
list_without_dupes = list(set(list_with_dupes))
2. Flattening a list of lists
Use case: flattening a small list of lists without other packages
- numpy or itertools should have other methods (see here).
### list comprehension
flat_list = [item for sublist in l for item in sublist]
### what the list comprehension means
flat_list = []
for sublist in l:
for item in sublist:
flat_list.append(item)
### simple function
def flatten(l):
return [item for sublist in l for item in sublist]
3. Checking if at least 1 element in list A exists in List B
Use case: you have an excluded word list. Any list you are searching that has at least one element from here needs to be excluded. (Source)
for sentence in sentences:
split_sentence = sentence.split()
if any(word in bad_words_list for word in split_sentence):
print("bad word found!")
Dictionary
1. Getting a value without key errors
Use case: you have dictionaries that may/may not have the keys you want. Instead of using if-else to check. You can return a default value instead (if that is what you have in mind)
# you have my_list, a list of dictionaries
values_you_want = []
for my_dict in my_list_of_dicts:
val = my_dict.get("key_you_need", None)
values_you_want.append(val)
### older method
values_you_want = []
for my_dict in my_list_of_dicts:
if "key_you_need" in my_dict:
val = my_dict["key_you_need"]
values_you_want.append(val)
else:
values_you_want.append(None)
2. Mapping dictionary to a dataframe column
Use case: you have a dataframe column where each element in this column is a key in your dictionary. You want to assign the dictionary values to another column in your dataframe.
my_df['values'] = my_df.key_col.map(my_dict)
Tuples
1. Accessing one index of a tuple
Use case: you have a list of tuples. You want to assign all values of a to a dataframe column “A”. Same thing for all b values.
##eg. list of tuples: [(a1, b1), (a2, b2)]
my_df['A'] = list_of_tuples.map(lambda x: x[0])