Subscribe to this thread
Home - General / All posts - Import table unexpected date format issue
Graeme

995 post(s)
#24-May-24 06:11

I downloaded and imported the May 2024 release of GNAF-Core from Australian PSMA. It is a text based PSV file and imported direct to M9 9.0.181. It includes a field "DATE_CREATED" Several records display as default 12/01/2024, but when the style is changed to "en-AU" English Australian, they display and query in M9 as being 01 December 2024. Thinking this may have been a data entry error, I raised it with the data supplier, who replied in part:

<!--[if gte mso 9]> <![endif]--><!--[if gte mso 9]> Normal 0 false false false EN-AU X-NONE X-NONE <![endif]--><!--[if gte mso 9]> <![endif]--><!--[if gte mso 10]> /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Times New Roman",serif;} <![endif]-->

GNAF Core is a text file and the dates are provided in the format dd-mm-yyyy

We can find many examples of 12-01-2024but can't find any with 01-12-2024. Any instances of “1/12/2024” you are seeing is likely to be a translation error from the application you are using to open the GNAF_CORE.psv file.

The records in question I have exported from M9 and attach.

Attachments:
GNAF_CORE_May2024_date error.psv

Dimitri


7,479 post(s)
#25-May-24 06:31

Key question: What is the actual date in the data set? Is it the first day in December 2024 or is it the twelfth day in January 2024? Given that we are not yet in December of 2024 I'm guessing that the authors of GNAF Core intended a date of 12 January 2024.

If they were using a sensible, modern interchange format, you wouldn't have to ask that question, but text formats like CSV/PSV are liable to all sorts of chaos, for example, because plain text does not unambiguously define data types like datetime values.

Different countries use different date formats (month day year, day month year, etc.) so a dumb text format like CSV/PSV does not in and of itself say what those dates are supposed to be. You have to guess at it, perhaps using clues like what the user has picked for a system language on his or her computer. There's a similar problem in CSV/PSV with the never-ending fight between use of dot . characters or comma , characters as decimal separators in numbers.

Start by reading the main CSV topic, carefully noting the advice on Unicode and also in the notes:

Dates- Different languages have different ways for expression dates. When recognizing date values as dates, Manifold is guided by whatever languages are specified in the Help - Aboutdialog in the Langentry, the presumption being that the date formats in use are those employed by whatever language is in use. Reading a CSV file allows date fields to use multiple different languages. For example, one field can use US English while another field can use German.

So, from the above if the data set you are using represents dates in text form using the traditional Australian convention giving a date in day month year order, and if you also have specified en-AU as the Lang entry, well, then there are good odds that the CSV importer will correctly and automatically transform the Australian convention text values used for dates into genuine datetime values in the table.

As to the difference between what actually is the datetime value and how it is styled, read the Styling Table Fields and Style: Datetime Fields topics for useful info like:

Styling a field does not change either the field's data type or the contents of that field. It just changes the way the contents of the field are displayed in a table window, the Info pane, Labels, or other settings where the table values are displayed. For example, styling a datetime field that contains the value 05/19/2016 07:27:37to use the D(date) option so it displays as Thursday, 19 May 2016does not convert the datetime field into a text field so that the day of the week and the month can be spelled out, nor does it truncate the datetime value to remove the time portion. It simply shows the datetime value using a different display format.

and

The Style dialog provides an extensive selection of different date and time styles, which will automatically adapt to language-country choices specified in the Languagesetting. Over 600 languages are supported by Manifold in Windows 10, with automatic use of correct language names for days of the week and months and automatic use of the correct cultural format for the country.

The key thing is to first import the data correctly. If your original file uses some local date format convention then it is essential your Lang setting matches that, to give the automatic matching of formats a chance to work correctly. After that, once you import your data you can Style it to appear however you want, using whatever local language style you want.

One last thing: you might not have any choice in the matter, but if you can, get the GNAF Core data in some modern format, not PSV. GPKG/SQLite is often an option. Then there's no ambiguity about the datetimes or any other field.

Graeme

995 post(s)
#03-Jun-24 08:59

Thanks for the considered reply and suggestions. I did find a W11 system date setting, which defaulted to US_English, changed and restarted. Retried the import, but still saw the anomaly. Will pass on the suggestion about none text formats for data supply.

Graeme

995 post(s)
#28-Aug-24 07:40

I've just run the import of the latest quarterly update from GNAF_core August 2024. The source data is still in text - .psv - format. In M9 9.0.181.0 the "DATE_CREATED" field is still reporting "maximum" (select top 10) as a future date "first of December 2024" (style set to Eng_AU to present as dd/mm/yyyy). As previously reported, these are existing addresses, not planned addresses, so a future date isn't possible.

To double check, I imported the same data to M8 and performed the same "select top 10". M8 has returned a result in line with expected date range for the August 2024 update release - 18 July 2024. See result screenshots.

Attachments:
M8 Date fortmat top 10 result Aug 2024.jpg
M9 Date fortmat top 10 result Aug 2024.jpg

sga33 post(s)
#04-Sep-24 07:54

Thanks Dimitri for the detailed answer, as always!

I perfectly get the point that date format is a mess and personally, when importing data with date in it, I change the format in excel to YYYY-MM-DD and that way I never have problems.

Yet, what I find really confusing in manifold is when you try to import dates with DD-MM-YYYY format and say you have two dates

  • 12-01-2024 (i.e. 12th of January)
  • 25-01-2024 (i.e. 25th of January)

Manifold will transform to MM/DD/YYYY but will have a problem with the second date and keep the DD/MM/YYYY. With the two previous examples, this will result in

  • 12/01/2024 (i.e. 1st of December)
  • 01/25/2024 (i.e. 25th of January)

The fact that manifold treats all records separately is very confusing to me. In my opinion, it should either raise an error saying that the 25th month of the year does not exist, or switch months and days for all rows.

Once imported, there is no way to correct this problem as you don't know if the months and days have been switched.

Dimitri


7,479 post(s)
#04-Sep-24 13:59

Without knowing all details of what you are importing and how I can't comment on what's going on in your case. If you think there's a bug, by all means don't hesitate to send in a bug report (https://manifold.net/tech/bug_reports.shtml).

In my experience the various dataports try different strategies to guess at how to import data, and they don't always guess right. There are also issues involved in guessing locale, or using Windows locale to guide imports, that can cause some variability in how imports happen. I've never seen it handle dates differently that are in the same field, unless there is some significant variation in the contents of the field, say, if it is a text field and describes dates differently. So there's a lot of detail involved in investigating exactly why what happens happens in a particular case.

Graeme

995 post(s)
#05-Sep-24 23:11

Bug report submitted.

Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.