Postgres from MSSQL - Notes from converting database from MSSQL to PostgreSQL

Post date: Jan 20, 2014 12:54:19 PM

Migration of schema

Script out Schemas, Tables, indexes and views. Stored procedures to be handled separately.

  1. From management studio, right click on the database to migrate, click tasks and then choose "Generate scripts..."
  2. Select Tables, Schemas and Views, selectively remove the objects that are not interesting to migrate (to save time) ,click next
  3. Click the "Advanced" button
  4. Set the following to False
      1. Script Collation
      2. Script Extended Properties
      3. Include Descriptive Headers
      4. Script data compression options
    1. Click OK to Close the advanced setting window
    2. Save the query to new Query Window

For ease of conversion I created one file per group of objects, ie Schemas.sql, Tables.Sql, Indexes.SQL, Constraints.Sql.

I did this by creating those files and copying the relevant scripts to those files. The generated file from management studio are sorted so that the splitting is easy.

By doing one script one at a time it was easy to spot the errors and progressively moving towards a fully converted database.

Note! the notes here are not complete and depending on what features you used in SQL Server you may have more manual work than me. Also, the regexes in my notes may need to be adjusted to fit your naming conventions, but they should give you a hit of what to do.

Now clean the scripts.

With Replace I mean to do Find->Replace Y with X,

With Remove i mean to do Find->Replace Y with "" (empty string)

  • For all objects:
    • remove brackets [ and ]
    • replace GO with ;
      • Remove SET operations.Tip: Find using Wildcards SET *
      • Proposed by Postgres migration documentation all script text should be made to lower case. I did not follow this rule. Hope it will not bite me later.
  • Schemas
    • Nothing special
  • Tables:
    • replace datetime2(x) with timestamp
    • replace nvarchar(x) with varchar(x)
    • replace varchar(max) with text
    • replace uniqueidentifyer with UUID
    • remove WITH (options) Tip: Find using Wildcards WITH (*
    • remove the ASC (for the primary key constraints)
    • remove the CLUSTERED and NONCLUSTERED keywords
  • Indexes:
    • INCLUDED columns in indexes are not supported, they need to be removed. If your index is unique then remove them, otherwise consider adding the included columns to the index.
    • Constraints
      • replace default date value specification, using Notepad++ Regex replace
        • find: ADD ADD CONSTRAINT[\s]*([\w]*)[\s]*DEFAULT \(getdate\(\)\) FOR (.*)
        • replacewith: ALTER COLUMN $2 SET DEFAULT current_timestamp
    • replace default integer values using notepad++
      • find: ADD CONSTRAINT[\s]*([\w]*)[\s]*DEFAULT \(\((\d*)\)\) FOR (.*)
      • replace with: ALTER COLUMN $3 SET DEFAULT $2
    • remove WITH CHECK
      • The foreign keys will have a CHECK CONSTRAINT clause in order to make them trusted in sql Server. These rows need to be removed.
        • find: (.*?)CHECK CONSTRAINT(.*)
        • replace with empty string
      • Bit columns need to have their default values casted to bit: (cast 1 as bit)

Completing these steps took less than an hour, including writing the regular expressions, for a database with about 50 tables. Stored procedures to be done later.

Findings after the migration:

    1. Naming
    2. The naming convention used in Sql Server was using CamelCase, in postgres all objects have lowercase names, making some of the names hard to read. The common naming convention in Postgres is to use lower_case with underscore as separator.
    3. Included columns in indexes
    • Included columns in indexes was a way of reducing the key size of the indexes in SQL Server. The combination of Filtered indexes and included columns made it possible to create a covering index with a unique constraint. This cannot be done in Postgres as far as I know.
    1. Tinyint is not supported in Postgres, smallest integer datatype is 2 bytes. Tinyint in SqlServer is 1 byte.
    2. Datacompression is applied to individual data value, not page or row. Suggested alternative is to use a compressed file system. The value compression is applied automatically.
    3. In SQL Server, two types of indexes are available. In Postgres we have many more. Each index should be evaluated to find the best type of index in postgres.
      1. Do note that you can have CLUSTERED index in postgres by using the CLUSTER clause (CLUSTER table_name USING index_name). This will sort the data according to the index. It will however not sort new data that will be inserted. To sort any new data the CLUSTER clause need to be repeated, it is not automatic for each insert as in SQL Server.
    4. Postgres limits the length of the object names to 64(63?) characters. In sql Server it is limited to 128.