龙空技术网

Super practical: converting between file encodings on Mac OS X

贫困的富七代

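Once upon a time, when this cat was still coding on Windows, my Ruby source files were all encoded as GBK, and lots of Chinese text showed up as mojibake. In the end, out of desperation, I wrote a small tool that converts a file from GBK to UTF-8: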

    #!/usr/bin/ruby
    # Tool: convert a GBK-encoded file to UTF-8

    src_path = ARGV[0]
    unless src_path
      puts "usage: #{File.basename($0)} gbk_file"
      exit 1
    end

    # Write the result next to the source file, prefixed with "u8_"
    dir_name, base_name = File.split(src_path)
    dst_path = File.join(dir_name, "u8_" + base_name)

    File.open(src_path, "r:gbk") do |f_src|
      File.open(dst_path, "w:utf-8") do |f_dst|
        f_src.each_with_index do |line, i|
          line = line.encode("utf-8")
          # A "coding" magic comment can only sit on the first two lines
          line.gsub!(/gbk/i, "utf-8") if i < 2 && line =~ /^#.*coding/
          f_dst.puts line
        end
      end
    end

    # Make the converted script executable
    system("chmod", "+x", dst_path)

Ruby is compact enough, but this still takes a dozen-plus lines, and it only handles one conversion (GBK to UTF-8).
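The one-format limit is not hard to lift, because Ruby's `String#encode` accepts any source and target encoding that Ruby knows about. The following is only a sketch, not the original tool: the helper name `convert_text` and its `from:`/`to:` parameters are made up for this illustration.

```ruby
# Hypothetical helper (not part of the original tool): re-encode raw bytes
# from one named encoding to another using Ruby's built-in String#encode.
def convert_text(bytes, from:, to:)
  # dup so the caller's string keeps its own encoding tag
  text = bytes.dup.force_encoding(from)
  # Fail early with a clear error if the bytes are not valid in `from`
  raise ArgumentError, "input is not valid #{from}" unless text.valid_encoding?
  text.encode(to)
end

# Example: round-trip a short Chinese string through GBK.
utf8 = "路人甲"
gbk  = convert_text(utf8, from: "UTF-8", to: "GBK")
back = convert_text(gbk, from: "GBK", to: "UTF-8")

puts gbk.bytesize   # GBK stores each of these characters in 2 bytes
puts back == utf8   # true when the round trip is lossless
```

For whole files, the same helper could be fed with `File.binread(path)` and its result written back with `File.binwrite`.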

iconv to the rescue

Later on, I discovered that Mac OS X actually ships with iconv, a real gem:

ICONV(1)                 Linux Programmer's Manual                 ICONV(1)

NAME
       iconv - character set conversion

SYNOPSIS
       iconv [OPTION...] [-f encoding] [-t encoding] [inputfile ...]
       iconv -l

DESCRIPTION
       The iconv program converts text from one encoding to another
       encoding. More precisely, it converts from the encoding given for
       the -f option to the encoding given for the -t option. Either of
       these encodings defaults to the encoding of the current locale.
       All the inputfiles are read and converted in turn; if no inputfile
       is given, the standard input is used. The converted text is
       printed to standard output.

       The encodings permitted are system dependent. For the libiconv
       implementation, they are listed in the iconv_open(3) manual page.

Let's give it a try. First create a UTF-8 text file:

路人甲:最近又多学了德语,现在懂中文,英语和德语啊

猫猫:靠,我早精通十几门语言了

路人甲:擦,我才不信

猫猫:汇编语言,C语言,C++语言,C#语言,ruby语言,javascript语言...

路人甲:...

Then use iconv to convert it to GBK (the reverse direction works just as well):

apple@kissAir: ruby_src$iconv -f UTF-8 -t GBK ex_u8.txt > ex_gbk.txt

apple@kissAir: ruby_src$cat ex_gbk.txt

·?˼?:????ֶ?ѧ?˵?????ڶ????ģ?Ӣ??͵??ﰡ

èè?????????羫ͨʮ??????????

·?˼ף??????ҲŲ???

èè?????????,C????,C++???ԣ?C#????,ruby????,javascript????...

·?˼?:...apple@kissAir: ruby_src$

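It looks like gibberish, but only because the terminal is not decoding the bytes as GBK. The file itself is genuine, honest-to-goodness GBK!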
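To see why genuine GBK shows up as garbage, here is a small Ruby check. It assumes the terminal was decoding output as UTF-8, which the article does not actually state; the variable names are made up for the illustration.

```ruby
# Illustrative check, assuming a UTF-8 terminal (an assumption, see above).
utf8_text = "路人甲"
gbk_bytes = utf8_text.encode("GBK")

# Viewed as UTF-8, the GBK bytes do not form a valid sequence.
as_utf8 = gbk_bytes.dup.force_encoding("UTF-8")
puts as_utf8.valid_encoding?

# Replacing each invalid piece with "?" imitates what a terminal might draw.
puts as_utf8.scrub("?")

# Viewed as GBK, the very same bytes decode back to the original text.
puts gbk_bytes.encode("UTF-8") == utf8_text
```

The bytes never change between the three steps; only the encoding used to interpret them does.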

Finally, let's see how many encodings iconv actually supports. It looks like an awful lot:

apple@kissAir: ruby_src$iconv -l
ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII CP367 IBM367 ISO-IR-6 ISO646-US ISO_646.IRV:1991 US US-ASCII CSASCII
UTF-8
UTF-8-MAC UTF8-MAC
ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4
UCS-4BE
UCS-4LE
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UTF-32BE
UTF-32LE
UNICODE-1-1-UTF-7 UTF-7 CSUNICODE11UTF7
UCS-2-INTERNAL
UCS-2-SWAPPED
UCS-4-INTERNAL
UCS-4-SWAPPED
C99
JAVA
CP819 IBM819 ISO-8859-1 ISO-IR-100 ISO8859-1 ISO_8859-1 ISO_8859-1:1987 L1 LATIN1 CSISOLATIN1
ISO-8859-2 ISO-IR-101 ISO8859-2 ISO_8859-2 ISO_8859-2:1987 L2 LATIN2 CSISOLATIN2
ISO-8859-3 ISO-IR-109 ISO8859-3 ISO_8859-3 ISO_8859-3:1988 L3 LATIN3 CSISOLATIN3
ISO-8859-4 ISO-IR-110 ISO8859-4 ISO_8859-4 ISO_8859-4:1988 L4 LATIN4 CSISOLATIN4
CYRILLIC ISO-8859-5 ISO-IR-144 ISO8859-5 ISO_8859-5 ISO_8859-5:1988 CSISOLATINCYRILLIC
ARABIC ASMO-708 ECMA-114 ISO-8859-6 ISO-IR-127 ISO8859-6 ISO_8859-6 ISO_8859-6:1987 CSISOLATINARABIC
ECMA-118 ELOT_928 GREEK GREEK8 ISO-8859-7 ISO-IR-126 ISO8859-7 ISO_8859-7 ISO_8859-7:1987 ISO_8859-7:2003 CSISOLATINGREEK
HEBREW ISO-8859-8 ISO-IR-138 ISO8859-8 ISO_8859-8 ISO_8859-8:1988 CSISOLATINHEBREW
ISO-8859-9 ISO-IR-148 ISO8859-9 ISO_8859-9 ISO_8859-9:1989 L5 LATIN5 CSISOLATIN5
ISO-8859-10 ISO-IR-157 ISO8859-10 ISO_8859-10 ISO_8859-10:1992 L6 LATIN6 CSISOLATIN6
ISO-8859-11 ISO8859-11 ISO_8859-11
ISO-8859-13 ISO-IR-179 ISO8859-13 ISO_8859-13 L7 LATIN7
ISO-8859-14 ISO-CELTIC ISO-IR-199 ISO8859-14 ISO_8859-14 ISO_8859-14:1998 L8 LATIN8
ISO-8859-15 ISO-IR-203 ISO8859-15 ISO_8859-15 ISO_8859-15:1998 LATIN-9
ISO-8859-16 ISO-IR-226 ISO8859-16 ISO_8859-16 ISO_8859-16:2001 L10 LATIN10
KOI8-R CSKOI8R
KOI8-U
KOI8-RU
CP1250 MS-EE WINDOWS-1250
CP1251 MS-CYRL WINDOWS-1251
CP1252 MS-ANSI WINDOWS-1252
CP1253 MS-GREEK WINDOWS-1253
CP1254 MS-TURK WINDOWS-1254
CP1255 MS-HEBR WINDOWS-1255
CP1256 MS-ARAB WINDOWS-1256
CP1257 WINBALTRIM WINDOWS-1257
CP1258 WINDOWS-1258
850 CP850 IBM850 CSPC850MULTILINGUAL
862 CP862 IBM862 CSPC862LATINHEBREW
866 CP866 IBM866 CSIBM866
CP1131
MAC MACINTOSH MACROMAN CSMACINTOSH
MACCENTRALEUROPE
MACICELAND
MACCROATIAN
MACROMANIA
MACCYRILLIC
MACUKRAINE
MACGREEK
MACTURKISH
MACHEBREW
MACARABIC
MACTHAI
HP-ROMAN8 R8 ROMAN8 CSHPROMAN8
NEXTSTEP
ARMSCII-8
GEORGIAN-ACADEMY
GEORGIAN-PS
KOI8-T
CP154 CYRILLIC-ASIAN PT154 PTCP154 CSPTCP154
KZ-1048 RK1048 STRK1048-2002 CSKZ1048
MULELAO-1
CP1133 IBM-CP1133
ISO-IR-166 TIS-620 TIS620 TIS620-0 TIS620.2529-1 TIS620.2533-0 TIS620.2533-1
CP874 WINDOWS-874
VISCII VISCII1.1-1 CSVISCII
TCVN TCVN-5712 TCVN5712-1 TCVN5712-1:1993
ISO-IR-14 ISO646-JP JIS_C6220-1969-RO JP CSISO14JISC6220RO
JISX0201-1976 JIS_X0201 X0201 CSHALFWIDTHKATAKANA
ISO-IR-87 JIS0208 JIS_C6226-1983 JIS_X0208 JIS_X0208-1983 JIS_X0208-1990 X0208 CSISO87JISX0208
ISO-IR-159 JIS_X0212 JIS_X0212-1990 JIS_X0212.1990-0 X0212 CSISO159JISX02121990
CN GB_1988-80 ISO-IR-57 ISO646-CN CSISO57GB1988
CHINESE GB_2312-80 ISO-IR-58 CSISO58GB231280
CN-GB-ISOIR165 ISO-IR-165
ISO-IR-149 KOREAN KSC_5601 KS_C_5601-1987 KS_C_5601-1989 CSKSC56011987
EUC-JP EUCJP EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE CSEUCPKDFMTJAPANESE
MS_KANJI SHIFT-JIS SHIFT_JIS SJIS CSSHIFTJIS
CP932
ISO-2022-JP CSISO2022JP
ISO-2022-JP-1
ISO-2022-JP-2 CSISO2022JP2
CN-GB EUC-CN EUCCN GB2312 CSGB2312
GBK
CP936 MS936 WINDOWS-936
GB18030
ISO-2022-CN CSISO2022CN
ISO-2022-CN-EXT
HZ HZ-GB-2312
EUC-TW EUCTW CSEUCTW
BIG-5 BIG-FIVE BIG5 BIGFIVE CN-BIG5 CSBIG5
CP950
BIG5-HKSCS:1999
BIG5-HKSCS:2001
BIG5-HKSCS:2004
BIG5-HKSCS BIG5-HKSCS:2008 BIG5HKSCS
EUC-KR EUCKR CSEUCKR
CP949 UHC
CP1361 JOHAB
ISO-2022-KR CSISO2022KR
CP856
CP922
CP943
CP1046
CP1124
CP1129
CP1161 IBM-1161 IBM1161 CSIBM1161
CP1162 IBM-1162 IBM1162 CSIBM1162
CP1163 IBM-1163 IBM1163 CSIBM1163
DEC-KANJI
DEC-HANYU
437 CP437 IBM437 CSPC8CODEPAGE437
CP737
CP775 IBM775 CSPC775BALTIC
852 CP852 IBM852 CSPCP852
CP853
855 CP855 IBM855 CSIBM855
857 CP857 IBM857 CSIBM857
CP858
860 CP860 IBM860 CSIBM860
861 CP-IS CP861 IBM861 CSIBM861
863 CP863 IBM863 CSIBM863
CP864 IBM864 CSIBM864
865 CP865 IBM865 CSIBM865
869 CP-GR CP869 IBM869 CSIBM869
CP1125
EUC-JIS-2004 EUC-JISX0213
SHIFT_JIS-2004 SHIFT_JISX0213
ISO-2022-JP-2004 ISO-2022-JP-3
BIG5-2003
ISO-IR-230 TDS565
ATARI ATARIST
RISCOS-LATIN1
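
A listing this long is easier to query than to read. The Ruby sketch below assumes the libiconv layout, where `iconv -l` prints one encoding per line with its aliases separated by spaces; other iconv implementations may format the list differently. The helper name `alias_groups` is hypothetical.

```ruby
# Hypothetical helper: turn `iconv -l` style text (one encoding per line,
# aliases separated by spaces) into a lookup from alias to its whole group.
def alias_groups(listing)
  lookup = {}
  listing.each_line do |line|
    names = line.split
    next if names.empty?
    names.each { |name| lookup[name.upcase] = names }
  end
  lookup
end

# A few lines taken from the iconv -l output in this article.
sample = <<~LIST
  UTF-8
  CN-GB EUC-CN EUCCN GB2312 CSGB2312
  GBK
  CP936 MS936 WINDOWS-936
LIST

groups = alias_groups(sample)
puts groups["GB2312"].join(", ")   # every alias listed on the GB2312 line
puts groups.key?("GBK")            # is GBK in the listing?
puts groups.key?("KLINGON")        # a name that is not in the listing
```

On a machine with iconv installed, the real listing could be passed in instead of `sample`, for example by capturing the output of `iconv -l`.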

Closing words

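One last aside, in praise of how coherent and consistent UNIX systems are. That consistency sharply cuts the cost of learning, and it gives you a real sense of accomplishment. For example, from Ruby I knew that a trailing i on a regular expression means "ignore case". One day I needed a case-insensitive search with grep, and guess which option I reached for: grep -i xxx. It is that consistent, that harmonious. Can you do that on Windows? Oh, right, Windows folks don't do the console, they are all about windows...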
