[R-lang] Re: writing Unicode files from R

Patrick Bolger pbolger@ualberta.ca
Wed Oct 20 14:55:07 PDT 2010


For entirely unrelated reasons, I've always had difficulty with Excel and Unicode. OpenOffice works great, though! You might try that.
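
If you want to stay with Excel, one thing that might be worth trying on the R side is to write the UTF-8 bytes yourself, so that R never converts the text to the native Windows encoding (my guess is that this conversion is where the <U+0430> escapes come from). This is only a rough sketch: `df` and `out` are hypothetical names standing in for your own data and file, the CSV lines are built by hand without any quoting, and the leading byte-order mark is there on the assumption that Excel needs it to recognize the file as UTF-8.

```r
# Hypothetical data frame standing in for the real data (Cyrillic strings).
df <- data.frame(word = c("\u0430\u0431\u0432", "\u0434\u0430"),
                 n = c(3L, 2L),
                 stringsAsFactors = FALSE)

# Hypothetical output path; replace with a real file name.
out <- tempfile(fileext = ".csv")

# Open the file in binary mode so R does not re-encode anything on write.
con <- file(out, open = "wb")

# Write the UTF-8 byte-order mark (EF BB BF) first.
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)

# Build simple CSV lines by hand (no quoting or escaping of fields).
lines <- c(paste(names(df), collapse = ","),
           paste(df$word, df$n, sep = ","))

# Make sure the strings are UTF-8, then write their bytes unchanged.
writeLines(enc2utf8(lines), con, useBytes = TRUE)
close(con)
```

If the file still shows escapes, checking `Encoding(df$word)` before writing may help tell whether the strings were already damaged on the way into R.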

-pat

On 2010-10-20, at 10:53 AM, Scott Jackson wrote:

> Thanks to everyone for responses so far!
> 
> I have tried both Lucien's and Nathaniel's suggestions with no luck.
> I'm in Windows XP, by the way, which may be making the difference, I'm
> not sure.  Their strategies (as well as every other thing I've tried)
> have ended up with strings like <U+0430> in the output, even if I open
> the file as UTF-8 within Excel, Notepad, etc.
> 
> The site John recommended does seem to work quite nicely for my
> current purposes.  It ends up not doing anything with the angle
> brackets, but it's easy enough to get rid of those once the Unicode
> (Cyrillic) is displayed correctly.  Not the most efficient procedure
> overall, but it (eventually) gets the right output, and that's the
> first time I've managed that.
> 
> Thanks again for the fast and helpful responses, and if anyone can
> help sort out why Lucien's and Nathaniel's techniques may not work for
> everyone (at least not for me), I'd love to hear it, since it seems
> like they should work, and either would be a much simpler solution if
> they did work.
> 
> -scott
> 
> On Wed, Oct 20, 2010 at 10:58 AM, Lucien Carroll <lucien@ling.ucsd.edu> wrote:
>> Hi,
>> 
>> To write a file in a particular encoding I found I needed to pass a
>> file connection with the specified encoding, rather than just the
>> filename.
>> 
>>> con <- file("pass_len.csv", "w", encoding = "UTF-8")
>>> write.csv(pass.len, con)
>>> close(con)
>> 
>> ~Lucien
>> 
>> On Wed, Oct 20, 2010 at 6:25 AM, Scott Jackson <scottuba@gmail.com> wrote:
>>> hi R-langers,
>>> 
>>> Has anyone worked with getting Unicode *out* of R?  I've had some
>>> success getting Unicode that began life in an Excel doc, text file,
>>> etc. into R and getting R to display the Unicode characters (e.g.,
>>> Arabic, Cyrillic) in the GUI and in plots.  However, whenever I've
>>> wanted to manipulate some data that has some Unicode text and get it
>>> out of R into a file that could be then opened in Excel and read by
>>> humans, I've gotten stuck.  write.table(), write(), cat(), and even
>>> writeClipboard() have all produced ASCII renderings of the
>>> Unicode (e.g., <U+0001>), which I can't manage to convert back into
>>> human-readable Unicode characters in Excel, Notepad, or other text
>>> editors.
>>> 
>>> Note I'm not just interested in printing out a *rendering* of the
>>> Unicode output (like you might be able to get via a
>>> Sweave/LaTeX-generated PDF), but actually spit out a file from R that
>>> can be opened and worked with in Excel (or any other Unicode-reading
>>> program), in such a way that the Unicode is displayed correctly.
>>> 
>>> My latest frustration is that I can get the Unicode to display in the
>>> Rgui, and if I literally highlight, copy & paste from the Rgui into
>>> Excel, it looks fine.  But that's not a workable solution for what I
>>> really want to do.  So it feels like I'm just one simple, obvious step
>>> away...
>>> 
>>> anyone have any ideas/suggestions?
>>> 
>>> thanks,
>>> -scott
>>> 
>> 
>> 
>> 
>> --
>> Lucien S. Carroll
>> Graduate Student
>> UCSD Linguistics
>> http://ling.ucsd.edu/~lucien
>> 
> 

--------------
Patrick Bolger, Ph.D.
Assistant Professor
Department of Linguistics
University of Alberta






