My top 25 pandas tricks



Summary of Top 25 pandas Tricks Video

  1. Show Installed Versions
    • Check pandas version with pd.__version__.
    • Use pd.show_versions() to see the versions of pandas and its dependencies.
  2. Create an Example DataFrame
    • Use a dictionary to construct a DataFrame.
    • For larger DataFrames, use np.random.rand() for the values and pass a string of letters to list() to get non-numeric column names.
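A minimal sketch of both construction styles. The column names and shapes are made up for illustration and are not taken from the video.

```python
import numpy as np
import pandas as pd

# Small frame: dictionary keys become column names, values become the columns.
small = pd.DataFrame({"col one": [100, 200], "col two": [300, 400]})

# Larger frame: random values, with list("abcdefgh") turning a string
# into eight one-letter column names.
large = pd.DataFrame(np.random.rand(4, 8), columns=list("abcdefgh"))
```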
  3. Rename Columns
    • Use rename() method with a dictionary for flexible renaming.
    • Overwrite the columns attribute to rename all columns at once.
    • Use columns.str.replace() to replace characters in column names.
    • Add prefix or suffix with add_prefix() or add_suffix().
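The four renaming options might look like this on a made-up two-column DataFrame (the names are illustrative assumptions):

```python
import pandas as pd

df = pd.DataFrame({"col one": [100, 200], "col two": [300, 400]})

# 1. rename() with a dictionary: only the listed columns change.
renamed = df.rename(columns={"col one": "col_one", "col two": "col_two"})

# 2. Overwrite the columns attribute to rename every column at once.
overwritten = df.copy()
overwritten.columns = ["col_one", "col_two"]

# 3. String method on the column index: replace spaces with underscores.
replaced = df.copy()
replaced.columns = replaced.columns.str.replace(" ", "_", regex=False)

# 4. Add a prefix or a suffix to every column name.
prefixed = renamed.add_prefix("X_")
suffixed = renamed.add_suffix("_Y")
```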
  4. Reverse Row Order
    • Reverse rows with loc[::-1].
    • Reset index after reversing with reset_index(drop=True).
  5. Reverse Column Order
    • Reverse columns with loc[:, ::-1].
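A short sketch of both reversals on a toy DataFrame (the data is made up):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Reverse the rows; the original index labels travel with them (2, 1, 0).
rows_reversed = df.loc[::-1]

# Reverse the rows and renumber the index from 0, discarding the old labels.
rows_reset = df.loc[::-1].reset_index(drop=True)

# Reverse the columns: all rows, columns in reverse order (b, a).
cols_reversed = df.loc[:, ::-1]
```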
  6. Select Columns by Data Type
    • Use select_dtypes() to filter columns by data type.
  7. Convert Strings to Numbers
    • Convert data types with astype().
    • Handle invalid input with pd.to_numeric() and errors='coerce'.
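As a rough illustration, assume a DataFrame whose numbers were read in as strings and where one cell holds a dash instead of a value (all data here is made up):

```python
import pandas as pd

df = pd.DataFrame({
    "col_one": ["1.1", "2.2", "3.3"],
    "col_two": ["4.4", "5.5", "6.6"],
    "col_three": ["7.7", "8.8", "-"],  # "-" cannot be parsed as a number
})

# astype() works when every value in the column can be converted.
df = df.astype({"col_one": "float", "col_two": "float"})

# errors="coerce" turns unparseable values into NaN instead of raising.
df["col_three"] = pd.to_numeric(df["col_three"], errors="coerce")
```

Calling astype() on col_three directly would raise an error because of the dash, which is why the coercing variant is used for that column.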
  8. Reduce DataFrame Size
    • Pass the usecols parameter to read_csv() to read only the columns you need.
    • Convert object columns that hold categorical data to the category data type.
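A sketch of both steps, using an in-memory CSV so the example runs without a file on disk. The column names and values are invented for illustration.

```python
import io

import pandas as pd

csv_text = (
    "country,continent,beer\n"
    "France,Europe,127\n"
    "Germany,Europe,346\n"
    "Japan,Asia,77\n"
    "China,Asia,79\n"
)

# Baseline: read every column with default dtypes.
full = pd.read_csv(io.StringIO(csv_text))

# Read only two columns, and store the repetitive text column as category.
small = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["continent", "beer"],
    dtype={"continent": "category"},
)

# memory_usage(deep=True) reports bytes per column, so the two frames can be compared.
full_bytes = full.memory_usage(deep=True).sum()
small_bytes = small.memory_usage(deep=True).sum()
```

The category type tends to pay off when a column has few distinct values relative to its length; on a tiny frame like this one the saving may be negligible.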
  9. Build a DataFrame from Multiple Files (Row-wise)
    • Use the glob module to list the files, then pd.concat() to combine them into one DataFrame row-wise.
  10. Build a DataFrame from Multiple Files (Column-wise)
    • Use glob and pd.concat() with axis='columns' to combine files column-wise.
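One way the row-wise and column-wise variants could look. The file names and contents are hypothetical, and the files are written to a temporary directory only so the sketch is self-contained.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Write four small, made-up CSV files to work with.
    pd.DataFrame({"x": [1, 2]}).to_csv(os.path.join(tmp, "rows1.csv"), index=False)
    pd.DataFrame({"x": [3, 4]}).to_csv(os.path.join(tmp, "rows2.csv"), index=False)
    pd.DataFrame({"a": [1, 2]}).to_csv(os.path.join(tmp, "cols1.csv"), index=False)
    pd.DataFrame({"b": [3, 4]}).to_csv(os.path.join(tmp, "cols2.csv"), index=False)

    # glob returns matches in arbitrary order, so sort for a predictable result.
    row_files = sorted(glob.glob(os.path.join(tmp, "rows*.csv")))
    col_files = sorted(glob.glob(os.path.join(tmp, "cols*.csv")))

    # Row-wise: stack the files on top of each other and renumber the index.
    by_rows = pd.concat([pd.read_csv(f) for f in row_files], ignore_index=True)

    # Column-wise: place the files side by side, aligned on the index.
    by_cols = pd.concat([pd.read_csv(f) for f in col_files], axis="columns")
```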
  11. Create a DataFrame from the Clipboard
    • Use pd.read_clipboard() to read data copied to the clipboard.
  12. Split a DataFrame into Two Random Subsets
    • Use sample() to draw a random subset of rows, then drop() those rows to get the rest.
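A minimal sketch of the split, assuming the index labels are unique (drop() removes rows by label, so duplicate labels would remove more rows than intended). The 75/25 ratio and the seed are arbitrary choices.

```python
import pandas as pd

df = pd.DataFrame({"value": range(8)})

# Randomly pick 75% of the rows; random_state makes the draw reproducible.
part_1 = df.sample(frac=0.75, random_state=1234)

# Everything that was not sampled goes into the second subset.
part_2 = df.drop(part_1.index)
```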
  13. Filter a DataFrame by Multiple Categories
    • Use isin() to filter by multiple categories.
  14. Filter a DataFrame by Largest Categories
    • Use value_counts() with nlargest() to find the largest categories, then isin() to filter by them.
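Both category filters can be sketched on a small made-up table of movies (titles and genres are placeholders):

```python
import pandas as pd

movies = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "genre": ["Drama", "Comedy", "Drama", "Action", "Comedy", "Horror"],
})

# Filter by an explicit list of categories.
picked = movies[movies["genre"].isin(["Action", "Horror"])]

# Find the two most frequent genres, then keep only rows in those genres.
top_genres = movies["genre"].value_counts().nlargest(2).index
top_rows = movies[movies["genre"].isin(top_genres)]
```

Prefixing the condition with ~ inverts the filter, keeping the rows that are not in the listed categories.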
  15. Handle Missing Values
    • Use isna() and dropna() to handle missing data.
  16. Split a String into Multiple Columns
    • Use str.split() with expand=True to split strings into separate columns.
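For example, with made-up name and location columns:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["John Arthur Doe", "Jane Ann Smith"],
    "location": ["Los Angeles, CA", "Washington, DC"],
})

# expand=True returns a DataFrame with one column per piece,
# which can be assigned to several new columns at once.
df[["first", "middle", "last"]] = df["name"].str.split(" ", expand=True)

# Keep only one piece by selecting a single column of the result.
df["city"] = df["location"].str.split(", ", expand=True)[0]
```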
  17. Expand a Series of Lists into a DataFrame
    • Use apply() with pd.Series to expand lists into a DataFrame.
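A small sketch, assuming every list in the column has the same length (the data is invented):

```python
import pandas as pd

df = pd.DataFrame({
    "col_one": ["a", "b", "c"],
    "col_two": [[10, 40], [20, 50], [30, 60]],
})

# Passing the pd.Series constructor to apply() turns each list into a row,
# so the result is a DataFrame with integer column labels 0 and 1.
expanded = df["col_two"].apply(pd.Series)

# Put the new columns next to the original ones.
combined = pd.concat([df, expanded], axis="columns")
```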
  18. Aggregate by Multiple Functions
    • Use groupby() and agg() to aggregate by multiple functions.
  19. Combine the Output of an Aggregation with a DataFrame
    • Use transform() to add an aggregation result as a new column.
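The difference between agg() and transform() can be seen on a hypothetical orders table (column names and prices are made up): agg() returns one row per group, while transform() returns one value per original row.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 1, 2, 2, 2],
    "item_price": [2.5, 3.0, 10.0, 1.0, 4.0],
})

# agg() with a list of functions: one row per order, one column per function.
summary = orders.groupby("order_id")["item_price"].agg(["sum", "count"])

# transform() keeps the original shape, so the result fits as a new column.
orders["total_price"] = orders.groupby("order_id")["item_price"].transform("sum")

# With the total on every row, per-row calculations become simple.
orders["percent_of_total"] = orders["item_price"] / orders["total_price"]
```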
  20. Select a Slice of Rows and Columns
    • Use loc to select a subset of rows and columns.
  21. Reshape a MultiIndexed Series
    • Use unstack() to convert a MultiIndexed Series into a DataFrame.
  22. Create a Pivot Table
    • Use pivot_table() to create pivot tables and add totals with margins=True.
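A sketch relating the two reshaping approaches on a tiny made-up passenger table (the column names and values are illustrative assumptions): grouping by two columns and unstacking gives the same cross-tabulated layout that pivot_table() builds directly.

```python
import pandas as pd

df = pd.DataFrame({
    "sex": ["female", "female", "female", "male", "male", "male"],
    "pclass": [1, 1, 2, 1, 2, 2],
    "survived": [1, 1, 0, 0, 0, 1],
})

# Grouping by two columns gives a Series with a two-level MultiIndex.
rates = df.groupby(["sex", "pclass"])["survived"].mean()

# unstack() moves the inner level (pclass) into the columns.
wide = rates.unstack()

# pivot_table() builds the same layout in one step;
# margins=True adds an "All" row and column with the overall figures.
table = df.pivot_table(
    index="sex", columns="pclass", values="survived",
    aggfunc="mean", margins=True,
)
```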
  23. Convert Continuous Data into Categorical Data
    • Use pd.cut() to bin continuous data into categories.
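For instance, ages could be binned into labelled groups. The bin edges and labels below are arbitrary choices for illustration.

```python
import pandas as pd

ages = pd.Series([5, 18, 22, 30, 99])

# Bins are right-inclusive by default: (0, 18], (18, 25], (25, 99].
groups = pd.cut(
    ages,
    bins=[0, 18, 25, 99],
    labels=["child", "young adult", "adult"],
)
```

The result is an ordered categorical Series, so the groups sort in bin order rather than alphabetically.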
  24. Change Display Options
    • Use pd.set_option() to change display precision.
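A minimal sketch using the display.precision option; the video may have used a different option, so treat the choice here as an assumption. The setting changes only how values are printed, not the stored data.

```python
import pandas as pd

df = pd.DataFrame({"price": [1.23456, 10.98765]})

# Show two decimal places when printing; the underlying values are untouched.
pd.set_option("display.precision", 2)
shown = str(df)

# Restore the default so later output is not affected.
pd.reset_option("display.precision")
```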
  25. Style a DataFrame
    • Use style.format() and chaining methods to style DataFrames.

Bonus Trick: Profile a DataFrame

  • Use pandas-profiling (since renamed ydata-profiling) to generate an interactive HTML report of a DataFrame.