I'm working on a CSV import script in PHP. It works fine, except for non-ASCII characters at the beginning of a field.

The code looks like this:

if (($handle = fopen($filename, "r")) !== FALSE)
{
     while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) 
         $teljing[] = $data;

     fclose($handle);
}

Here is a data example showing my issue:

føroyskir stavir, "Kr. 201,50"
óvirkin ting, "Kr. 100,00"

This will result in the following:

array 
(
     [0] => array 
          (
                 [0] => 'føroyskir stavir',
                 [1] => 'Kr. 201,50'
          )
     [1] => array 
          (
                 [0] => 'virkin ting', <--- Should be 'óvirkin ting'
                 [1] => 'Kr. 100,00'
          )
)

I have seen this behavior documented in some comments on php.net, and I have tried ini_set('auto_detect_line_endings', TRUE); to detect line endings, without success.

Anyone familiar with this issue?

Edit:

Thank you, AJ. This issue is now solved. Calling the following before reading the file was the solution:

setlocale(LC_ALL, 'en_US.UTF-8');

Comments

What happens if you call fopen($filename, "rb")?

Written by AJ

@AJ Exactly the same - I cannot see any difference in the results

Written by Ragnar123

From the PHP manual for fgetcsv(): "Note: Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function."

Written by AJ

@AJ - Using setlocale(LC_ALL, 'en_US.UTF-8'); worked. Thank you.

Written by Ragnar123

Glad it worked. I'll post it below as an answer.

Written by AJ

Accepted Answer

From the PHP manual for fgetcsv():

"Note: Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function."
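
In practice this means the locale has to be set before the first fgetcsv() call. Below is a minimal sketch of the import loop from the question with that change applied. The function name readCsvRows and the 'C.UTF-8' fallback locale are assumptions added for illustration, not part of the original post, and whether a given locale is installed depends on the system.

```php
<?php
// Hypothetical helper (not from the original post): reads a CSV file
// into an array of rows after switching to a UTF-8 locale.
function readCsvRows(string $filename): array
{
    // fgetcsv() takes the locale into account, so set it first.
    // setlocale() tries each name in order and returns false if none
    // of them is installed. 'C.UTF-8' is an assumed fallback.
    $applied = setlocale(LC_ALL, 'en_US.UTF-8', 'C.UTF-8');
    if ($applied === false) {
        error_log('No UTF-8 locale available; leading non-ASCII characters may be lost.');
    }

    $rows = [];
    if (($handle = fopen($filename, 'r')) !== false) {
        // The enclosure and escape arguments are passed explicitly,
        // using the same values as the long-standing defaults.
        while (($data = fgetcsv($handle, 1000, ',', '"', '\\')) !== false) {
            $rows[] = $data;
        }
        fclose($handle);
    }

    return $rows;
}
```

Calling $teljing = readCsvRows($filename); would then replace the loop in the question. Note that setlocale() changes the locale for the whole process, so it can affect other locale-aware functions in the same script.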

Written by AJ
The content is written by members of the stackoverflow.com community.
It is licensed under cc-wiki