How to delete a small portion of data from a BIG table?

You have a big table. The biggest in your system.

You might say: big table, big fun. In some situations, though, it is also a big challenge.

The manager gives you a task:

  • Delete a small portion of data from this table. Only about 1% of rows need to be removed.
  • How would you approach this task?

    What query will you build? How would you minimize logical reads? Would you approach this task differently if it were a one-time activity rather than a task executed on a regular basis?

    Consider: is this table used exclusively by you? Or does some other process insert into it in parallel?

    Continue reading to see how to delete data in batches on SQL Server.
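
As a rough illustration of the batching idea, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in so that it runs anywhere; it is not the SQL Server solution from the article. The table name `big_table`, the `to_delete` flag column, and the batch size are assumptions made up for this example. On SQL Server the same loop would typically be built around `DELETE TOP (n)`.

```python
import sqlite3


def delete_in_batches(conn, batch_size=1000):
    """Delete flagged rows in small batches and return the total number deleted."""
    total = 0
    while True:
        # Remove at most `batch_size` matching rows per statement.
        cur = conn.execute(
            "DELETE FROM big_table WHERE id IN ("
            "  SELECT id FROM big_table WHERE to_delete = 1 LIMIT ?)",
            (batch_size,),
        )
        # Commit after every batch so each transaction stays short.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Hypothetical demo data: 10,000 rows, 1% of them flagged for deletion.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, to_delete INTEGER)")
conn.executemany(
    "INSERT INTO big_table (id, to_delete) VALUES (?, ?)",
    [(i, 1 if i % 100 == 0 else 0) for i in range(1, 10001)],
)
conn.commit()

deleted = delete_in_batches(conn, batch_size=30)
remaining = conn.execute("SELECT COUNT(*) FROM big_table").fetchone()[0]
print(deleted, remaining)
```

Committing after each batch keeps every transaction small, which matters when other processes insert into the same table in parallel.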


    Pandas can do THIS? Data Engineer perspective on pandas

    We all work with data.

    The amount of data is growing fast, both in business and in daily life. It needs to be extracted from different places, merged, filtered, and sent to someone.

    And do it AS FAST AS POSSIBLE.

    Probably you also have a lot of data to be analyzed.

    Most likely, you don’t like to repeat these operations over and over again. Doing everything manually is a tedious task.

    Python and pandas might be the tools that you need.

    Pandas lets you:
    – read data from heterogeneous sources (CSV, Excel, databases, Parquet, etc.),
    – analyze the data,
    – operate on the data,
    – manipulate the data,
    – supplement it with other data,
    – filter and sort it.

    After you are done with your operations, pandas lets you store the result in your favorite format: Excel, CSV, Parquet. Whatever you like.
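
The read, supplement, filter, sort, and store steps listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the two small CSV snippets, their column names, and the threshold of 50 are invented for the example, and in-memory buffers stand in for real files.

```python
import io

import pandas as pd

# Hypothetical input data; in practice these would be files or database tables.
orders_csv = io.StringIO(
    "order_id,customer_id,amount\n1,10,250.0\n2,11,40.0\n3,10,90.0\n"
)
customers_csv = io.StringIO("customer_id,name\n10,Alice\n11,Bob\n")

# Read from the sources.
orders = pd.read_csv(orders_csv)
customers = pd.read_csv(customers_csv)

# Supplement the orders with customer data.
merged = orders.merge(customers, on="customer_id", how="left")

# Filter and sort.
result = merged[merged["amount"] > 50].sort_values("amount", ascending=False)

# Store the result; here as CSV text in memory instead of a file on disk.
out = io.StringIO()
result.to_csv(out, index=False)
csv_text = out.getvalue()
print(csv_text)
```

Swapping `to_csv` for `to_excel` or `to_parquet` writes the same frame in those formats, provided the optional engine packages they rely on are installed.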


    7 tips to automate your daily DWH/BI developer life using PowerShell and Excel

    Connect to Excel using PowerShell

    PowerShell is a powerful tool that will make your life easier. You can use it to automate your daily work or to make boring tasks interesting. It can also free up time for whatever you like.

    If you are working on a Windows machine you already have it installed. This is an additional benefit.

    As a bonus, you will find the Git scripts at the end of the article: two working programs that do all the juicy stuff.

    In this blog post you will see how you can:
    – Connect to Excel using PowerShell
    – Get a sheet name
    – Find a named table
    – Display a value from a cell
    – Loop through a table
    – Execute an Excel macro from PowerShell
    – And finally save an Excel file using PowerShell
