Introduction PostgreSQL is one of the largest open source SQL database engines. In this article I'm going to do a quick run through on installing and interacting with it. After installation we will be using command line to install a sample database and then explore some essential SQL Shell (psql) commands.This post follows installing PostgreSQL … Continue reading Getting started with PostgreSQL
DAX: The difference between SUM and SUMX
Introduction Understanding the difference between your standard aggregate functions and your X aggregate functions is useful because there will be scenarios for using one over the other. How does SUM() work Sum, average, min, max etc are all fairly easy to understand. They accept a single column input and aggregate the data. Here is the … Continue reading DAX: The difference between SUM and SUMX
Fast Load Data To SQL Server Using Python
Introduction There's plenty of good online material on how to insert data to SQL Server using python but most of it is aimed at OLTP type work loads. Essentially using the python modules pyodbc or sqlalchemy you can insert data from a CSV file to a table but you'll notice that each row is inserted … Continue reading Fast Load Data To SQL Server Using Python
Strategies for Staging Data to SQL
Introduction Staging data is an important part of the ETL (or ELT) process. The general concept is this: before you bring your data into your database, data store, data mart or data warehouse, you need to pre-load it. This is the process of extracting and then loading before moving it on, otherwise known as “staging”. … Continue reading Strategies for Staging Data to SQL
Limit and Offsetting Rows in SQL Server
Introduction In SQL Server you might know you can select the first N rows of a query by using TOP(N), but you can also do the same and skip rows using OFFSET FETCH. Skipping Rows To skip rows you can use the OFFSET <N> ROWS syntax like so... SELECT OrderId ,CustomerID ,OrderDate FROM [Sales].[Orders] ORDER … Continue reading Limit and Offsetting Rows in SQL Server
SQL Server and Docker is easy as 1, 2, 3
Introduction In this short article I'm going to show you how to run and connect to a SQL Server instance running on docker. If you've never used Docker before then I highly recommend you give it a try for your next development project. I find docker easy to set up and work with. It removes … Continue reading SQL Server and Docker is easy as 1, 2, 3
Connect to SQL Server using Python
In this post I'm going to demonstrate how you can connect to SQL Server in python, read data from a table and load it into a data frame. Prerequisites In order to connect to SQL Server you'll need to ensure you have the correct ODBC Driver installed.You'll also want to pip install the Python packages … Continue reading Connect to SQL Server using Python
Import JSON data to SQL Server
In a previous article I demonstrated how you can import XML data using the OPENROWSET() function. In this article I'm going to show how you can do the same but with a JSON file and the OPENJSON() function. Example JSON [ { "id": 1, "product_name": "white paint", "status": "in stock", "orders": [ { "order_id": 1, … Continue reading Import JSON data to SQL Server
Import XML data to SQL Server
In this article I'm going to provide an example of how you can read data from an XML file in SQL Server.To achieve this we're going to use OPENROWSET and OPENXML functions. In this example I'm using the XML sample books which you can obtain from here. Open Row Set Using the open row set … Continue reading Import XML data to SQL Server
Slowly Changing Dimension Types
In this post I'm going to cover the different types of Slowly Changing Dimensions (SCD). What is a Slowly Changing Dimension? In a Kimball structured star schema you use Slowly Changing Dimensions to associate facts with their original value. This is because sometimes associating a fact with it's changed dimensional value will look odd.For example … Continue reading Slowly Changing Dimension Types