Tag: Pandas

  • Python – Learn Pandas with SQL Examples – Football Analytics Example – Useful code


    When working with data, you will often move between SQL databases and Pandas DataFrames. SQL is excellent for storing and retrieving data, while Pandas is ideal for analysis inside Python.

    In this article, we show how both can be used together, using a football (soccer) mini-league dataset. We build a small SQLite database in memory, read the data into Pandas, and then solve real analytics questions.

    There are neither pythons nor pandas in Bulgaria. Just software.

    • Setup – SQLite and Pandas

    We start by importing the libraries and creating three tables (teams, players, matches) inside an SQLite in-memory database.

    Now, we have three tables.
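    A minimal sketch of such a setup is shown below. The column names and the sample rows are assumptions made for illustration; the original article's schema and data may differ.

```python
import sqlite3

# In-memory SQLite database; it disappears when the connection is closed.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: table names come from the text, columns are assumed.
conn.executescript("""
CREATE TABLE teams (
    team_id   INTEGER PRIMARY KEY,
    team_name TEXT NOT NULL,
    city      TEXT NOT NULL
);
CREATE TABLE players (
    player_id   INTEGER PRIMARY KEY,
    player_name TEXT NOT NULL,
    team_id     INTEGER REFERENCES teams(team_id),
    position    TEXT,
    minutes     INTEGER,
    goals       INTEGER
);
CREATE TABLE matches (
    match_id     INTEGER PRIMARY KEY,
    home_team_id INTEGER REFERENCES teams(team_id),
    away_team_id INTEGER REFERENCES teams(team_id),
    home_goals   INTEGER,
    away_goals   INTEGER
);
""")

# Invented sample rows, only to have something to query.
conn.executemany("INSERT INTO teams VALUES (?, ?, ?)", [
    (1, "Lions", "Sofia"),
    (2, "Eagles", "Plovdiv"),
    (3, "Wolves", "Varna"),
])
conn.executemany("INSERT INTO players VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Ivan", 1, "FW", 1500, 12),
    (2, "Petar", 1, "MF", 1300, 4),
    (3, "Georgi", 2, "FW", 900, 7),
    (4, "Dimitar", 2, "DF", 1600, 1),
    (5, "Nikolay", 3, "FW", 1250, 9),
])
conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 2, 2, 1),
    (2, 2, 3, 0, 0),
    (3, 3, 1, 1, 3),
])
conn.commit()

# List the tables that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```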

    • Loading SQL Data into Pandas


    pd.read_sql loads the result of a custom query directly into a DataFrame; given an SQLAlchemy connection, it can also load a whole table by name.

    At this point, the SQL data is ready for analysis with Pandas.
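    As a rough sketch, loading could look like the code below. The players table, its columns and its rows are assumed for illustration. A plain sqlite3 connection is used, so the whole table is read with a SELECT * query.

```python
import sqlite3

import pandas as pd

# Tiny stand-in database; columns and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (
    player_name TEXT, team_id INTEGER, position TEXT,
    minutes INTEGER, goals INTEGER
);
INSERT INTO players VALUES
    ('Ivan', 1, 'FW', 1500, 12),
    ('Petar', 1, 'MF', 1300, 4),
    ('Georgi', 2, 'FW', 900, 7);
""")

# Load the whole table into a DataFrame.
players = pd.read_sql("SELECT * FROM players", conn)

# Load the result of a custom, parameterised query.
scorers = pd.read_sql(
    "SELECT player_name, goals FROM players WHERE goals > ? ORDER BY goals DESC",
    conn,
    params=(5,),
)

print(players)
print(scorers)
```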

    • SQL vs Pandas – Filtering Rows

    Task: Find forwards (FW) with more than 1200 minutes on the field:

    SQL:

    Pandas:

    As expected, both return the same subset, one written in SQL and the other in Pandas.
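    The filtering task above can be sketched as follows. The column names (position, minutes) and the sample rows are assumptions, not the article's actual data.

```python
import sqlite3

import pandas as pd

# Stand-in data; names and numbers are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (player_name TEXT, position TEXT, minutes INTEGER);
INSERT INTO players VALUES
    ('Ivan', 'FW', 1500),
    ('Petar', 'MF', 1300),
    ('Georgi', 'FW', 900),
    ('Nikolay', 'FW', 1250);
""")

# SQL: the WHERE clause does the filtering inside the database.
sql_fw = pd.read_sql(
    """
    SELECT player_name, minutes
    FROM players
    WHERE position = 'FW' AND minutes > 1200
    ORDER BY player_name
    """,
    conn,
)

# Pandas: a boolean mask does the same filtering on the DataFrame.
players = pd.read_sql("SELECT * FROM players", conn)
mask = (players["position"] == "FW") & (players["minutes"] > 1200)
pandas_fw = (
    players.loc[mask, ["player_name", "minutes"]]
    .sort_values("player_name")
    .reset_index(drop=True)
)

print(sql_fw)
print(pandas_fw)
```

    Note the parentheses around each condition in the Pandas mask: & binds more tightly than == and >, so they are required.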

    Task: Total goals per team:

    SQL:

    Pandas:

    Both results show which team has scored more goals overall.
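    A possible version of the aggregation, in both styles, is sketched below. The team_id and goals columns and the sample numbers are assumptions.

```python
import sqlite3

import pandas as pd

# Stand-in data; invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (player_name TEXT, team_id INTEGER, goals INTEGER);
INSERT INTO players VALUES
    ('Ivan', 1, 12),
    ('Petar', 1, 4),
    ('Georgi', 2, 7),
    ('Dimitar', 2, 1),
    ('Nikolay', 3, 9);
""")

# SQL: GROUP BY with SUM.
sql_totals = pd.read_sql(
    """
    SELECT team_id, SUM(goals) AS total_goals
    FROM players
    GROUP BY team_id
    ORDER BY total_goals DESC
    """,
    conn,
)

# Pandas: groupby followed by sum, then the same sort order.
players = pd.read_sql("SELECT * FROM players", conn)
pandas_totals = (
    players.groupby("team_id", as_index=False)["goals"]
    .sum()
    .rename(columns={"goals": "total_goals"})
    .sort_values("total_goals", ascending=False)
    .reset_index(drop=True)
)

print(sql_totals)
print(pandas_totals)
```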

    Task: Add the city of each team to the players table.

    SQL:

    Pandas:
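    Both versions of the join can be sketched together as below, assuming the two tables share a team_id column (an assumption, as are the sample rows).

```python
import sqlite3

import pandas as pd

# Stand-in data; invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team_name TEXT, city TEXT);
CREATE TABLE players (player_name TEXT, team_id INTEGER);
INSERT INTO teams VALUES (1, 'Lions', 'Sofia'), (2, 'Eagles', 'Plovdiv');
INSERT INTO players VALUES ('Ivan', 1), ('Petar', 1), ('Georgi', 2);
""")

# SQL: LEFT JOIN keeps every player, even one without a matching team.
sql_join = pd.read_sql(
    """
    SELECT p.player_name, p.team_id, t.city
    FROM players AS p
    LEFT JOIN teams AS t ON t.team_id = p.team_id
    ORDER BY p.player_name
    """,
    conn,
)

# Pandas: merge with how="left" is the equivalent of LEFT JOIN.
players = pd.read_sql("SELECT * FROM players", conn)
teams = pd.read_sql("SELECT * FROM teams", conn)
pandas_join = (
    players.merge(teams[["team_id", "city"]], on="team_id", how="left")
    .sort_values("player_name")
    .reset_index(drop=True)
)

print(sql_join)
print(pandas_join)
```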

    The fun part: calculating points (3 for a win, 1 for a draw) and goal difference. Only with SQL this time.

    This produces a proper football league ranking – teams sorted by points and then goal difference:
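    One way to write such a query is sketched below; it is not necessarily the query from the article. The schema and the three match results are assumptions. Each match is first unfolded into two rows, one per team, and then points and goal difference are summed per team.

```python
import sqlite3

# Stand-in data; invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teams (team_id INTEGER PRIMARY KEY, team_name TEXT);
CREATE TABLE matches (
    home_team_id INTEGER, away_team_id INTEGER,
    home_goals INTEGER, away_goals INTEGER
);
INSERT INTO teams VALUES (1, 'Lions'), (2, 'Eagles'), (3, 'Wolves');
INSERT INTO matches VALUES (1, 2, 2, 1), (2, 3, 0, 0), (3, 1, 1, 3);
""")

LEAGUE_TABLE_SQL = """
WITH results AS (
    -- One row per team per match: goals for (gf) and goals against (ga).
    SELECT home_team_id AS team_id, home_goals AS gf, away_goals AS ga
    FROM matches
    UNION ALL
    SELECT away_team_id, away_goals, home_goals
    FROM matches
)
SELECT t.team_name,
       SUM(CASE WHEN r.gf > r.ga THEN 3
                WHEN r.gf = r.ga THEN 1
                ELSE 0 END) AS points,
       SUM(r.gf - r.ga) AS goal_diff
FROM results AS r
JOIN teams AS t ON t.team_id = r.team_id
GROUP BY t.team_id, t.team_name
ORDER BY points DESC, goal_diff DESC
"""

league_table = conn.execute(LEAGUE_TABLE_SQL).fetchall()
for team_name, points, goal_diff in league_table:
    print(f"{team_name:<8} {points:>2} pts  GD {goal_diff:+d}")
```

    With this sample data, two teams finish level on points, so the second sort key (goal difference) decides their order.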

    • Quick Pandas Tricks

      • Top scorers with nlargest:
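    A short sketch of the nlargest trick, using an invented players DataFrame (names and goal counts are assumptions):

```python
import pandas as pd

# Invented sample data for illustration.
players = pd.DataFrame({
    "player_name": ["Ivan", "Petar", "Georgi", "Dimitar", "Nikolay"],
    "goals": [12, 4, 7, 1, 9],
})

# nlargest(n, column) returns the n rows with the highest values in that
# column, already sorted in descending order.
top_scorers = players.nlargest(3, "goals")[["player_name", "goals"]]
print(top_scorers)
```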

    https://www.youtube.com/watch?v=U0lbBaHFAEM

    https://github.com/Vitosh/Python_personal/tree/master/YouTube/041_Python-Learn-Pandas-with-Football-Analytics




  • Python – Data Wrangling with Excel and Pandas – Useful code



    Data wrangling with Excel and Pandas is a useful tool in the belt of any Excel professional, financial professional, data analyst or developer. Really, everyone can benefit from well-defined libraries that make people’s lives easier. These are the libraries used:

    Additionally, a function for making a unique Excel name is used:
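    The original function is not reproduced here. A hypothetical helper with the same purpose might look like the sketch below; the name unique_excel_name, its parameters and the timestamp-based naming scheme are all assumptions.

```python
from datetime import datetime
from pathlib import Path


def unique_excel_name(prefix: str = "report", folder: str = ".") -> Path:
    """Return a path such as <folder>/<prefix>_YYYYMMDD_HHMMSS.xlsx that does not exist yet."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    candidate = Path(folder) / f"{prefix}_{stamp}.xlsx"
    counter = 1
    # If a file with this name already exists, append a counter until it is free.
    while candidate.exists():
        candidate = Path(folder) / f"{prefix}_{stamp}_{counter}.xlsx"
        counter += 1
    return candidate


name = unique_excel_name("sales")
print(name)
```

    The function only builds a name; it does not create the file, so two calls within the same second return the same path until something is written to it.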

    A screenshot from the video, in which Jupyter Notebook is used.

    In the YouTube video below, the following eight tricks are discussed:

    # Trick 1 – Simple reading of worksheet from Excel workbook

    # Trick 2 – Combine Reports

    # Trick 3 – Fix Missing Values

    # Trick 4 – Formatting the exported Excel file

    # Trick 5 – Merging Excel Files

    # Trick 6 – Smart Filtering

    # Trick 7 – Merging Tables

    # Trick 8 – Export Dataframe to Excel
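    A few of these tricks can be sketched with plain Pandas calls. The code below is not the code from the video: the DataFrames are invented stand-ins for worksheets that would normally be read with pd.read_excel, and all column names and the export_report helper are assumptions.

```python
import pandas as pd

# Stand-ins for two monthly report worksheets and a lookup sheet.
january = pd.DataFrame(
    {"product_id": [1, 2], "units": [10, None], "month": ["Jan", "Jan"]}
)
february = pd.DataFrame(
    {"product_id": [1, 3], "units": [7, 5], "month": ["Feb", "Feb"]}
)
products = pd.DataFrame(
    {"product_id": [1, 2, 3], "product": ["Ball", "Boots", "Shirt"]}
)

# Combine reports: stack the monthly frames on top of each other.
combined = pd.concat([january, february], ignore_index=True)

# Fix missing values: treat a missing unit count as zero.
combined["units"] = combined["units"].fillna(0).astype(int)

# Merge tables: add the product name from the lookup sheet.
report = combined.merge(products, on="product_id", how="left")

# Filtering: rows from February with at least 7 units.
busy = report.query("units >= 7 and month == 'Feb'")
print(busy)


def export_report(df: pd.DataFrame, path: str) -> None:
    """Write the DataFrame to an Excel file without the index column.

    Requires an Excel writer engine such as openpyxl to be installed.
    """
    df.to_excel(path, index=False)
```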

    The whole code with the Excel files is available on GitHub here.

    https://www.youtube.com/watch?v=SXXc4WySZS4

    Enjoy it!




  • Pandas v Psycopg. A Postgres database speed test. Who… | by Thomas Reid


    Two racing cars in a race, one represents Pandas, the other Psycopg2
    Image by Author

    Following on from a story I wrote comparing the speed of the Pandas and Polars libraries at reading data from and writing data to a Postgres database, I thought it might be interesting to do a similar comparison between Pandas and Psycopg2.

    If you need to move data between a Postgres database table and a local file, in either direction, read on for the winner.

    You can find the Pandas v Polars article at the link below:

    Pandas

    I don’t think I need to explain much about what Pandas is. It is ubiquitous in Python code and is one of the main tools people use to load, explore, visualise and process large amounts of data.

    Psycopg

    Psycopg is one of the most popular PostgreSQL database libraries for the Python programming language. It implements the Python Database API Specification v2.0, allowing Python applications to communicate with PostgreSQL databases.
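    To illustrate what the DB-API 2.0 pattern looks like in practice, here is a small sketch. It uses the standard-library sqlite3 module as a stand-in so that it runs without a Postgres server; that substitution, the readings table and its values are assumptions for illustration. With Psycopg2 the connection would instead come from psycopg2.connect(...) and the query placeholders would be %s rather than ?.

```python
import sqlite3

# sqlite3 stands in for Psycopg here: both follow DB-API 2.0, so the
# connect / cursor / execute / fetch sequence has the same shape.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
cur.executemany(
    "INSERT INTO readings (value) VALUES (?)",
    [(1.5,), (2.5,), (4.0,)],
)
conn.commit()

# Parameters are passed separately from the SQL text, never interpolated.
cur.execute(
    "SELECT COUNT(*), SUM(value) FROM readings WHERE value > ?",
    (2.0,),
)
count, total = cur.fetchone()
print(count, total)

cur.close()
conn.close()
```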


