Breaking down barriers! Performance, file size and multi-lingual data in MapInfo Pro v15.2
Have you ever had to break a MapInfo table into two because its size exceeded the 2 GB limit? Or have you had to work with data that included multiple character sets (Unicode data)? If so, there is good news!
MapInfo Pro v15.2 supports Unicode (UTF-8 and UTF-16 character sets) and offers the ability to create files larger than 2 GB in size. To go along with this, performance improvements have been made to make the use of large files more practical.
Some significant barriers have come down with this release.
Creating large tables - the Extended TAB file format
Version 15.2 is the first version of MapInfo Pro that can create a TAB file larger than 2 GB. We call this the Extended TAB file format. Note that this format is not used by default, because MapInfo Pro v15.2 is the only release that supports Extended TAB files.
Creating an extended format table is easy. You will find this option when saving a copy of a table or creating a new table. Choose MapInfo Extended (*.tab) in the Save as Type drop-down list.
When creating a new table, the Extended TAB file option is also available.
When using the Import capabilities, choose the MapInfo Extended Tab option when the dataset is larger than 2 GB.
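Before importing, it can help to check whether the source data is likely to cross the 2 GB mark. The following is a minimal Python sketch, not part of MapInfo Pro. The names `TWO_GB`, `dataset_size` and `exceeds_limit` are hypothetical, and it assumes that "2 GB" means 2**31 bytes and that the combined size of the source files is a fair proxy for the size of the resulting table.

```python
import os

# Assumption: "2 GB" is taken as 2**31 bytes. The article does not state
# the exact threshold that MapInfo Pro applies.
TWO_GB = 2 * 1024 ** 3


def dataset_size(paths):
    """Return the combined size, in bytes, of the given files."""
    return sum(os.path.getsize(p) for p in paths)


def exceeds_limit(paths, limit=TWO_GB):
    """Return True if the combined size is larger than the limit."""
    return dataset_size(paths) > limit
```

If `exceeds_limit` returns True for the files you plan to import, that is a hint to choose the MapInfo Extended TAB option. The size of the imported table can differ from the size of the source files, so treat the result as a rough guide only.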
Version 15.2 also fully supports Unicode. This allows you to correctly display data in multiple character sets at the same time. This can be mixed data in a single table or different data sets using different character sets, as in the screenshot below.
In the screenshot above, Russian (Cyrillic), Arabic and Japanese data are all being displayed in MapInfo Pro at the same time.
A side note about importing data into MapInfo TAB or Extended TAB: When v15.2 is released, the included Universal Translator cannot create or read MapInfo Extended TAB files. An updated version of the Universal Translator is scheduled for release in early to mid-2016.
UTF-8 versus UTF-16
Unicode provides a unique number for every character, no matter what the platform, program or language. UTF-8 and UTF-16 are two of the established standards for encoding those numbers. They differ in how many bytes they use to encode each character. Both are variable-width encodings and can use up to four bytes per character, but the minimum differs: UTF-8 uses as little as 1 byte (8 bits), while UTF-16 uses at least 2 bytes (16 bits). This has a significant impact on the size of the encoded files. For data containing only ASCII characters, a UTF-16 encoded file is roughly twice as big as the same file encoded with UTF-8.
If most of the characters in the file are ASCII characters, it is advisable to use UTF-8 encoding. Otherwise it is better to use UTF-16 encoding.
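The byte counts described above are easy to check for yourself. Here is a small Python sketch, independent of MapInfo Pro, that measures how many bytes a piece of text needs in each encoding. The function names `encoded_sizes` and `smaller_encoding` and the sample strings are illustrative only.

```python
def encoded_sizes(text):
    """Return the number of bytes needed for text in UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" is used so the 2-byte byte-order mark that Python's
        # plain "utf-16" codec prepends is not counted.
        "utf-16": len(text.encode("utf-16-le")),
    }


def smaller_encoding(text):
    """Return the encoding that needs fewer bytes (UTF-8 wins a tie)."""
    sizes = encoded_sizes(text)
    return "utf-8" if sizes["utf-8"] <= sizes["utf-16"] else "utf-16"


for sample in ["Main Street", "Москва", "東京都"]:
    print(sample, encoded_sizes(sample), smaller_encoding(sample))
```

For the ASCII sample, UTF-16 needs twice as many bytes as UTF-8. The Cyrillic sample needs the same number of bytes in both encodings, and the Japanese sample is smaller in UTF-16. Real tables mix text with numbers and geometry, so the effect on overall file size depends on your data.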
Version 15.2 allows you to save an existing table into a new table with UTF-8 or UTF-16 encoding. You can encounter data corruption, due to truncation or conversion, when saving a copy of a table between Unicode and non-Unicode character sets. When saving non-UTF-8 (non-Unicode) to UTF-8 (Unicode), there is the potential for data truncation.
There is information on this in the MapInfo Help file, and we have placed a reminder in the Save Copy As... dialog box.
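To see why truncation and conversion problems can occur, consider the following Python sketch. It is not MapInfo Pro code. The helper names `fit_to_width` and `survives_conversion` are hypothetical, and the sketch assumes that a character field has a fixed width measured in bytes, which may not match how MapInfo Pro stores data internally.

```python
def fit_to_width(text, encoding, width):
    """Encode text, keep at most `width` bytes, and decode what is left."""
    raw = text.encode(encoding)[:width]
    # errors="ignore" drops a multi-byte character that was cut in half.
    return raw.decode(encoding, errors="ignore")


def survives_conversion(text, encoding):
    """Return True if every character in text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


city = "Москва"
# Six Cyrillic letters fit a 6-byte field in the Windows Cyrillic code page...
print(fit_to_width(city, "cp1251", 6))
# ...but each letter needs 2 bytes in UTF-8, so only half of them fit.
print(fit_to_width(city, "utf-8", 6))
# Going the other way, a Western code page has no Cyrillic letters at all.
print(survives_conversion(city, "cp1252"))
```

Under these assumptions, the same six letters that fit the field in a single-byte character set are cut to three letters in UTF-8, and converting them to a character set that lacks Cyrillic fails outright. Widening the field before saving is one way to leave room for the larger encoding.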
Setting the default behaviour for Extended TAB and Unicode data
Creating Unicode format datasets and creating tables that exceed the 2 GB file size limit are not turned on by default.
- This means you have to expressly choose MapInfo Extended (*.tab) when you want to create a file that will exceed 2 GB in size. This is because only version 15.2 of the software supports the new Extended TAB file format. You cannot share data in the Extended TAB format with earlier versions of MapInfo Pro.
- Likewise, you will need to expressly choose the UTF-8 or UTF-16 character sets. In general, we recommend UTF-8, especially if most of the characters in your data are ASCII characters.
If desired, you can change the defaults in the System Settings dialog box. You will find the system preferences (now called Options) in the "Backstage" area (the Pro tab on the ribbon).
Performance improvements in v15.2
SQL Select - filtering data: Queries that filter data sets now run faster, particularly when the result set (the number of rows returned by the query) is large. The larger the result set, the more dramatic the improvement. Performance varies across a number of factors, but for result sets of 20,000 to 25,000 rows (and greater) the improvement starts to become significant.
Note that the two bottom examples were querying a data set of 21.5 GB in size. The size of the result set is a major factor in the performance and the performance improvement. The larger the result set, the longer it used to take earlier versions of MapInfo Pro to complete the query.
Improved redraw performance for point data:
MapInfo Pro v15.2 has improved performance when displaying a large number of point objects. The same computer as used in the tests above was used for these comparisons.
(On-screen) Map querying performance:
The time needed to complete selections with the map selecting tools (rectangle select, radius select, boundary select, polygon select, invert selection) is also improved. As with the query filtering performance mentioned above, the improvement is more dramatic for larger result sets than for simple selections of only a few objects.
Here is an example that puts together the improved point rendering performance with the map querying performance. Improvements have been made at two points along the workflow, saving significant time.
Smart Indexing: The August issue includes an article where we covered this performance improvement. When your data is indexed, certain object editing operations (including update column, deleting data and combining data) finish in less time than before. In some cases, the improvement is very significant.
Smart Indexing has been added to both v15.0 (the latest 32-bit release) and v15.2.
What about raster data?
This article is all about the improvements to working with vector data in MapInfo Pro v15.2. For those of you who work with raster grid data, be sure to check out the series of articles on MapInfo Pro Advanced. This is our next generation raster grid analysis tool for MapInfo Pro. It too can work with very large datasets! Click here for an overview.
In conclusion, we hope you like what you see here. We have removed some longstanding limitations in the software to allow you to work with larger data sets.
MapInfo Pro v15.2 is scheduled to go to production on Oct 30th.
Are you using the latest version of MapInfo Pro?
Links to MapInfo Pro free trials in a number of languages can be found here: http://web.pb.com/mapinfopro-archive/resources
Article by Tom Probert, Editor of "The MapInfo Pro" journal
When not writing articles for "The MapInfo Pro" journal, Tom enjoys talking to MapInfo Pro users at conferences and events. When not working he likes to see movies with car chases, explosions and kung-fu fighting.