Example of Charset in Java NIO
December 13, 2013
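java.nio.charset.Charset was introduced in JDK 1.4. A Charset is a named mapping between sequences of bytes and sequences of 16-bit Unicode characters, and it provides methods to encode and decode between the two. Charset names must follow a few rules: a name must begin with a letter or a digit, may otherwise contain only letters, digits, and the characters '-', '+', '.', ':' and '_', and is not case-sensitive. Every charset has one canonical name and may also have aliases; Java accepts either when looking up a charset. All Charset methods are safe to use from multiple concurrent threads.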
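The naming rules can be checked with a short sketch. The class name CharsetNameSketch and the helper isLegalName are hypothetical names chosen for illustration. The sketch relies on Charset.isSupported() throwing IllegalCharsetNameException when a name breaks the rules. The aliases it prints depend on the JDK in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

class CharsetNameSketch {

    // Hypothetical helper: true if the name obeys the charset naming rules,
    // whether or not a charset with that name is actually installed.
    static boolean isLegalName(String name) {
        try {
            Charset.isSupported(name); // throws only when the name itself is illegal
            return true;
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Lookup is not case-sensitive, so a lowercase name resolves too.
        Charset latin1 = Charset.forName("iso-8859-1");
        System.out.println(latin1.name());    // canonical name
        System.out.println(latin1.aliases()); // alias set, varies by JDK

        System.out.println(isLegalName("UTF-8"));
        System.out.println(isLegalName("UTF 8")); // a space is not an allowed character
        System.out.println(isLegalName("-UTF8")); // must begin with a letter or digit
    }
}
```

A legal name is not necessarily a supported one: isSupported() returns false, without throwing, for a well-formed name that no installed charset answers to.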
Standard charsets
Every implementation of the Java platform is required to support the following charsets.
US-ASCII: Seven-bit ASCII characters.
ISO-8859-1: ISO Latin Alphabet No. 1.
UTF-8: Eight-bit UCS Transformation Format.
UTF-16BE: Sixteen-bit UCS Transformation Format, big-endian byte order.
UTF-16LE: Sixteen-bit UCS Transformation Format, little-endian byte order.
UTF-16: Sixteen-bit UCS Transformation Format, with the byte order identified by an optional byte-order mark.
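To see how these charsets differ in practice, the sketch below encodes one non-ASCII character with each of them and prints the number of bytes produced. It assumes Java 7 or later, where the six charsets are available as constants in java.nio.charset.StandardCharsets. The class name StandardCharsetsSketch and the helper encodedLength are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StandardCharsetsSketch {

    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset[] standard = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16
        };

        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        for (Charset cs : standard) {
            System.out.println(cs.name() + " -> " + encodedLength(text, cs) + " byte(s)");
        }
    }
}
```

UTF-8 needs two bytes for this character, while UTF-16 reports four because Java's UTF-16 encoder writes a big-endian byte-order mark before the two data bytes. US-ASCII cannot represent the character at all, so getBytes() substitutes a single replacement byte.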
Charset.forName() in Java NIO
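Returns a Charset object for the given charset name. The name can be the canonical name or an alias. An illegal name raises IllegalCharsetNameException, and a legal name with no available charset raises UnsupportedCharsetException.
Charset.displayName() in Java NIO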
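This method returns a human-readable name for the charset in the default locale. The default implementation simply returns the canonical name, which is also available from name().
Charset.canEncode() in Java NIO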
This method tells whether the charset supports encoding. Nearly all charsets do; the main exceptions are special-purpose auto-detect charsets that can only decode.
Charset.decode() in Java NIO
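This method decodes the bytes of a ByteBuffer, interpreted in the given charset, into a CharBuffer of Unicode characters. Malformed or unmappable input is replaced with the charset's default replacement string rather than reported as an error.
Charset.encode() in Java NIO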
This method encodes a CharBuffer of Unicode characters into a ByteBuffer in the given charset. An overload that accepts a String is also available.
CharsetExample.java
package com.concretepage.nio.charset;

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

public class CharsetExample {
    public static void main(String[] args) {
        Charset charset = Charset.forName("US-ASCII");
        System.out.println(charset.displayName());
        System.out.println(charset.canEncode());
        String s = "Hello, This is Charset Example.";
        // Get the bytes in the same charset that decodes them,
        // instead of relying on the platform default charset.
        ByteBuffer bb = ByteBuffer.wrap(s.getBytes(charset));
        // Convert the byte buffer in the given charset to a char buffer in Unicode.
        CharBuffer cb = charset.decode(bb);
        // Convert the char buffer in Unicode back to a byte buffer in the given charset.
        ByteBuffer newbb = charset.encode(cb);
        // Each US-ASCII byte maps directly to one char.
        while (newbb.hasRemaining()) {
            char ch = (char) newbb.get();
            System.out.print(ch);
        }
    }
}
Output
US-ASCII
true
Hello, This is Charset Example.
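The example above uses only ASCII text, so it never shows what encode() and decode() do with characters the charset cannot represent. The following sketch repeats the round trip with one non-ASCII character. The class name CharsetRoundTripSketch and the helper roundTrip are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTripSketch {

    // Hypothetical helper: encode the text to bytes, then decode those bytes back.
    static String roundTrip(String text, Charset charset) {
        ByteBuffer bytes = charset.encode(text);  // Unicode chars -> bytes in charset
        return charset.decode(bytes).toString();  // bytes in charset -> Unicode chars
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // ends with a character outside US-ASCII

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));

        // US-ASCII cannot represent the last character; encode() silently
        // substitutes the charset's replacement byte, which decodes as '?'.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII));

        // A CharsetEncoder can report the problem before any data is lost.
        System.out.println(StandardCharsets.US_ASCII.newEncoder().canEncode(text));
    }
}
```

Because the convenience methods on Charset replace bad input instead of failing, a check with CharsetEncoder.canEncode() is one way to detect lossy conversions when silent substitution is not acceptable.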