Let’s say I have this:
char registered = '®';
or an umlaut, or any other Unicode character. How could I get its code?
asked Jan 5, 2010 at 14:18
Just convert it to int:
char registered = '®';
int code = (int) registered;
In fact there’s an implicit conversion from char to int, so you don’t have to specify it explicitly as I’ve done above, but I would do so in this case to make it obvious what you’re trying to do.
This will give the UTF-16 code unit – which is the same as the Unicode code point for any character defined in the Basic Multilingual Plane. (And only BMP characters can be represented as char values in Java.) As Andrzej Doyle’s answer says, if you want the Unicode code point from an arbitrary string, use Character.codePointAt().
Once you’ve got the UTF-16 code unit or Unicode code point, both of which are integers, it’s up to you what you do with them. If you want a string representation, you need to decide exactly what kind of representation you want. (For example, if you know the value will always be in the BMP, you might want a fixed 4-digit hex representation prefixed with U+, e.g. "U+0020" for space.) That’s beyond the scope of this question though, as we don’t know what the requirements are.
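As one possible illustration of the fixed 4-digit hex form mentioned above, here is a minimal sketch. The class name CodeUnitFormat and the helper toUPlus are hypothetical names chosen for this example, and the sketch assumes the "U+" plus uppercase hex style is the representation you want.

```java
final class CodeUnitFormat {
    // Formats a code value as "U+" followed by at least four uppercase hex digits.
    static String toUPlus(int code) {
        return String.format("U+%04X", code);
    }

    public static void main(String[] args) {
        char space = ' ';
        char registered = '\u00AE'; // the registered sign
        // A char argument is implicitly widened to int when passed to toUPlus.
        System.out.println(toUPlus(space));      // U+0020
        System.out.println(toUPlus(registered)); // U+00AE
    }
}
```

For code points above U+FFFF the same format string simply produces more than four digits.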
answered Jan 5, 2010 at 14:20
Jon Skeet
A more complete, albeit more verbose, way of doing this would be to use the Character.codePointAt method. This will handle characters outside the BMP, which Java stores as a surrogate pair (a high surrogate followed by a low surrogate) because their code points do not fit within the range that a single char can represent.
In the example you’ve given this is not strictly necessary – if the (Unicode) character can fit inside a single (Java) char (such as the registered local variable) then it must fall within the \u0000 to \uffff range, and you won’t need to worry about surrogate pairs. But if you’re looking at potentially higher code points, from within a String/char array, then calling this method is wise in order to cover the edge cases.
For example, instead of
String input = ...;
char fifthChar = input.charAt(4);
int codePoint = (int)fifthChar;
use
String input = ...;
int codePoint = Character.codePointAt(input, 4);
Not only is this slightly less code in this instance, but it will handle detection of surrogate pairs for you.
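To make the difference concrete, here is a small sketch using U+1F600, a code point outside the BMP picked purely for illustration. The class name SurrogateDemo and its helper methods are hypothetical names for this example.

```java
final class SurrogateDemo {
    // Returns the first UTF-16 code unit of the string.
    static int firstCodeUnit(String s) {
        return s.charAt(0);
    }

    // Returns the first full Unicode code point of the string.
    static int firstCodePoint(String s) {
        return Character.codePointAt(s, 0);
    }

    public static void main(String[] args) {
        // U+1F600 is stored as two chars: a high and a low surrogate.
        String input = "\uD83D\uDE00";
        System.out.println(input.length());                               // 2
        System.out.println(Integer.toHexString(firstCodeUnit(input)));    // d83d
        System.out.println(Integer.toHexString(firstCodePoint(input)));   // 1f600
    }
}
```

The charAt result is only the high surrogate, which is not a meaningful code on its own; codePointAt combines both halves of the pair.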
answered Jan 5, 2010 at 14:25
Andrzej Doyle
In Java, char is technically a “16-bit integer”, so you can simply cast it to int and you’ll get its code.
From Oracle:
The char data type is a single 16-bit Unicode character. It has a
minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or
65,535 inclusive).
So you can simply cast it to int.
char registered = '®';
System.out.println(String.format("This is an int-code: %d", (int) registered));
System.out.println(String.format("And this is an hexa code: %x", (int) registered));
answered Apr 15, 2013 at 19:16
Felype
For me, only “Integer.toHexString(registered)” worked the way I wanted:
char registered = '®';
System.out.println("Answer:"+Integer.toHexString(registered));
This gives you only the hexadecimal string representation, which is the form usually shown in Unicode tables. Jon Skeet’s answer explains more.
answered Jul 21, 2015 at 12:00
There is an open source library, MgntUtils, that has a utility class StringUnicodeEncoderDecoder. That class provides static methods that convert any String into a Unicode sequence and vice versa. Very simple and useful. To convert a String you just do:
String codes = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(myString);
For example, the String “Hello World” will be converted into
“\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064”
It works with any language. Here is the link to the article that explains all the details about the library: MgntUtils. Look for the subtitle “String Unicode converter”. The library can be obtained as a Maven artifact or taken from GitHub (including source code and Javadoc).
answered May 29, 2018 at 17:26
Michael Gantman
Jon Skeet’s answer gives you the character’s decimal code, but Unicode code points are conventionally written in hexadecimal, so you may prefer to represent the code in hex rather than in decimal.
There is an open source tool at http://unicode.codeplex.com that provides complete information about a character or a sentence.
So it may be better to write a helper that takes a char as a parameter and returns its hex code as a string:
public static String GetHexCode(char character)
{
    // %04X pads the value to four uppercase hex digits
    return String.format("%04X", (int) character);
}//end
Hope it helps.
answered Jan 6, 2010 at 13:39
Nasser Hadjloo
// You can get the Unicode value of a character like this:
int a = 'a';
// 'a' is the letter or symbol whose Unicode value you want
// You can get the symbol or letter back from its Unicode value like this:
System.out.println((char) 123);
// 123 is the Unicode value you want to convert back to a character
answered May 24, 2021 at 14:44
How can I get the UTF-8 code of a char in Java?
I have the char ‘a’ and I want the value 97
I have the char ‘é’ and I want the value 233
here is a table for more values
I tried Character.getNumericValue(a) but for a it gives me 10 and not 97, any idea why?
This seems very basic but any help would be appreciated!
asked Dec 1, 2010 at 21:22
char is actually a numeric type containing the Unicode value (UTF-16, to be exact – you need two chars to represent characters outside the BMP) of the character. You can do everything with it that you can do with an int.
Character.getNumericValue() tries to interpret the character as a digit.
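The following sketch contrasts the two operations. It relies on the documented behaviour that Character.getNumericValue() maps the Latin letters to 10 through 35 (as digits in higher radixes), which is why 'a' yields 10. The class name NumericValueDemo and the helper codeOf are hypothetical names for this example.

```java
final class NumericValueDemo {
    // Returns the UTF-16 code unit of the char via implicit widening.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(Character.getNumericValue('7')); // 7  (digit value)
        System.out.println(Character.getNumericValue('a')); // 10 (letter treated as a radix digit)
        System.out.println(codeOf('a'));                    // 97 (character code)
        System.out.println(codeOf('\u00E9'));               // 233 (e-acute)
    }
}
```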
answered Dec 1, 2010 at 21:27
Michael Borgwardt
You can use the codePointAt(int index) method of java.lang.String for that. Here’s an example:
"a".codePointAt(0) --> 97
"é".codePointAt(0) --> 233
If you want to avoid creating strings unnecessarily, the following works as well and can be used for char arrays:
Character.codePointAt(new char[] {'a'},0)
answered Dec 1, 2010 at 21:34
Kaitsu
Those “UTF-8” codes are no such thing. They’re actually just Unicode values, as per the Unicode code charts.
So an ‘é’ is actually U+00E9 – in UTF-8 it would be represented by two bytes { 0xc3, 0xa9 }.
Now to get the Unicode value – or to be more precise the UTF-16 value, as that’s what Java uses internally – you just need to convert the value to an integer:
char c = '\u00e9'; // c is now e-acute
int i = c; // i is now 233
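To see both values side by side, here is a minimal sketch that prints the code of e-acute and its UTF-8 bytes. It uses String.getBytes with StandardCharsets.UTF_8; the class name Utf8BytesDemo and the helper utf8Bytes are hypothetical names for this example.

```java
import java.nio.charset.StandardCharsets;

final class Utf8BytesDemo {
    // Encodes a single char as UTF-8 and returns the bytes as unsigned values.
    static int[] utf8Bytes(char c) {
        byte[] raw = String.valueOf(c).getBytes(StandardCharsets.UTF_8);
        int[] unsigned = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            unsigned[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to 0..255
        }
        return unsigned;
    }

    public static void main(String[] args) {
        char c = '\u00e9';
        System.out.println((int) c); // 233, i.e. U+00E9
        for (int b : utf8Bytes(c)) {
            System.out.println(Integer.toHexString(b)); // c3, then a9
        }
    }
}
```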
answered Dec 1, 2010 at 21:29
Jon Skeet
This produces a good result:
int a = 'a';
System.out.println(a); // outputs 97
Likewise:
System.out.println((int)'é');
prints out 233.
Note that both examples work for any char value, not just ASCII: the first relies on the implicit widening conversion from char to int, and the second uses an explicit cast. You can achieve the same result by multiplying the char by 1, because the arithmetic promotes it to int.
System.out.println(1 * 'é');
answered Dec 1, 2010 at 21:27
Robertas
Your question is unclear. Do you want the Unicode codepoint for a particular character (which is the example you gave), or do you want to translate a Unicode codepoint into a UTF-8 byte sequence?
If the former, then I recommend the code charts at http://www.unicode.org/
If the latter, then the following program will do it:
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;

public class Foo
{
    public static void main(String[] argv)
    throws Exception
    {
        char c = '\u00E9'; // e-acute
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        OutputStreamWriter out = new OutputStreamWriter(bos, "UTF-8");
        out.write(c);
        out.flush();
        byte[] bytes = bos.toByteArray();
        // Mask with 0xFF so each byte prints as an unsigned value
        for (int ii = 0 ; ii < bytes.length ; ii++)
            System.out.println(bytes[ii] & 0xFF);
    }
}
(there’s also an online Unicode to UTF8 page, but I don’t have the URL on this machine)
answered Dec 1, 2010 at 21:30
Anon
My method to do it is something like this:
char c = 'c';
int i = Character.codePointAt(String.valueOf(c), 0);
// testing
System.out.println(String.format("%c -> %d", c, i)); // c -> 99
answered Nov 15, 2016 at 18:07
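You can create a simple loop to list the characters for a range of code values like this (here 12 through 999; note that these are Unicode code points rather than UTF-8 bytes, and the range covers only a small part of Unicode):
public class UTF8Characters {
    public static void main(String[] args) {
        for (int i = 12; i <= 999; i++) {
            System.out.println(i + " - " + (char) i);
        }
    }
}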
answered Jun 6, 2017 at 8:35
connelblaze
There is an open source library, MgntUtils, that has a utility class StringUnicodeEncoderDecoder. That class provides static methods that convert any String into a Unicode sequence and vice versa. Very simple and useful. To convert a String you just do:
String codes = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(myString);
For example, the String “Hello World” will be converted into
“\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064”
It works with any language. Here is the link to the article that explains all the details about the library: MgntUtils. Look for the subtitle “String Unicode converter”. The article gives you a link to Maven Central, where you can get the artifacts, and to GitHub, where you can get the project itself. The library comes with well-written Javadoc and source code.
answered Nov 15, 2016 at 18:21
Michael Gantman
ASCII is an acronym that stands for American Standard Code for Information Interchange. ASCII assigns a specific numerical value to each character and symbol so that computers can store and manipulate text. The machine always works with the binary form of that number, because it cannot store the character itself directly.
Approaches: There are four ways to print the ASCII value (code) of a specific character. They are listed below, each with a brief explanation of the concept followed by a Java example.
- Using brute force Method
- Using the type-casting Method
- Using the format specifier Method
- Using Byte class Method
Method 1: Assigning a Variable to the int Variable
To find the ASCII value of a character, simply assign the character to a new variable of type int. Java implicitly widens the char to int, so the new variable holds the character's numeric code (its UTF-16 code unit, which equals the ASCII value for characters in the ASCII range).
Implementation: Brute force Method
Java
public class GFG {
    public static void main(String[] args)
    {
        char ch = '}';
        int ascii = ch;
        System.out.println("The ASCII value of " + ch + " is: " + ascii);
    }
}
Output
The ASCII value of } is: 125
Method 2: Using Type-Casting
Type-casting in Java converts a value from one data type to another. In this approach, the character is explicitly cast from type char to type int while printing, which prints the ASCII value of the character.
Java
public class GFG {
    public static void main(String[] args)
    {
        char ch = '}';
        System.out.println("The ASCII value of " + ch + " is: " + (int)ch);
    }
}
Output
The ASCII value of } is: 125
Note: Methods 1 and 2 are both forms of typecasting. In method 1 the conversion is done automatically by the compiler; in method 2 it is written out manually. There is no performance difference between the two, since widening a char to int produces the same result either way; the explicit cast only makes the intent clearer to the reader. Typecasting done automatically is called implicit typecasting, and typecasting written by the programmer is called explicit typecasting.
Method 3: Using a format specifier
In this approach, we produce the ASCII value of the given character with the help of a format specifier. The character is cast to int and formatted with the %d specifier, so the Formatter object holds the decimal ASCII value of that character as text.
Java
import java.util.Formatter;

public class GFG {
    public static void main(String[] args)
    {
        char character = '}';
        Formatter formatSpecifier = new Formatter();
        formatSpecifier.format("%d", (int)character);
        System.out.println("The ASCII value of the character ' " + character + " ' is " + formatSpecifier);
    }
}
Output
The ASCII value of the character ' } ' is 125
Method 4: Finding the ASCII value by generating bytes
- Initialize the character as a string.
- Create an array of type byte by using the getBytes() method.
- Print the element at index 0 of the byte array.
This is the ASCII value of the character at index 0 of the string. This method is generally used to convert a whole string to its ASCII values. The try-catch is needed because getBytes(String charsetName) declares the checked UnsupportedEncodingException, which is thrown when the named charset is not supported.
Java
import java.io.UnsupportedEncodingException;

public class GFG {
    public static void main(String[] args)
    {
        try {
            String sp = "}";
            byte[] bytes = sp.getBytes("US-ASCII");
            System.out.println("The ASCII value of " + sp.charAt(0) + " is " + bytes[0]);
        }
        catch (UnsupportedEncodingException e) {
            System.out.println("OOPs!!!UnsupportedEncodingException occurs.");
        }
    }
}
Output
The ASCII value of } is 125
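As a hedged variation on the byte-based approach, the sketch below passes StandardCharsets.US_ASCII instead of a charset name, which avoids the checked exception. It also shows that a character with no ASCII mapping is replaced by '?' (63) rather than causing an error. The class name AsciiBytesDemo and the helper firstAsciiByte are hypothetical names for this example.

```java
import java.nio.charset.StandardCharsets;

final class AsciiBytesDemo {
    // Encodes the string as US-ASCII and returns the first byte.
    static byte firstAsciiByte(String s) {
        return s.getBytes(StandardCharsets.US_ASCII)[0];
    }

    public static void main(String[] args) {
        System.out.println(firstAsciiByte("}"));      // 125
        // e-acute has no ASCII encoding, so the encoder substitutes '?'.
        System.out.println(firstAsciiByte("\u00E9")); // 63
    }
}
```

Because of that substitution, a result of 63 is ambiguous: it can mean a real question mark or an unmappable character.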
Last Updated :
28 Dec, 2021
Trusted answers to developer questions
What are ASCII values?
ASCII is a 7-bit code that assigns letters, digits, punctuation, and control characters to 128 slots (values 0 to 127).
For example:
Character | ASCII value |
---|---|
a | 97 |
b | 98 |
A | 65 |
B | 66 |
Cast char to int
Cast a character from the char data type to the int data type to give the ASCII value of the character.
Code
In the code below, we assign the character to an int variable to convert it to its ASCII value.
public class Main {
    public static void main(String[] args) {
        char ch = 'a';
        int as_chi = ch;
        System.out.println("ASCII value of " + ch + " is - " + as_chi);
    }
}
In the code below, we print the ASCII value of every character in a string by casting it to int.
public class Main {
    public static void main(String[] args) {
        String alphabets = "abcdjfre";
        for (int i = 0; i < alphabets.length(); i++) {
            char ch = alphabets.charAt(i);
            System.out.println("ASCII value of " + ch + " is - " + (int) ch);
        }
    }
}
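Casting a char to int returns its UTF-16 code unit, which matches the ASCII value only for codes 0 through 127. If the input might contain other characters, a check along these lines can help; this is a sketch, and the class name AsciiCheck and the helper isAscii are hypothetical names for this example.

```java
final class AsciiCheck {
    // Returns true when every char in the string is in the 7-bit ASCII range (0 to 127).
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAscii("abcdjfre"));   // true
        System.out.println(isAscii("caf\u00E9"));  // false, e-acute is code 233
    }
}
```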
Copyright ©2023 Educative, Inc. All rights reserved.
In this post, we will see how to convert a character to its ASCII numeric value in Java.
There are multiple ways to convert a character to its ASCII numeric value in Java.
Table of Contents
- By casting char to int
- Using toCharArray()
- Using String’s getBytes()
- Using String’s chars() [Java 8+]
- Convert a String of letters to an int of corresponding ascii
By casting char to int
You can simply get a char from a String using charAt() and cast it to int.
Here is an example:
package org.arpit.java2blog;

public class CharToASCIICast {
    public static void main(String[] args) {
        String s = "Hello";
        char c = s.charAt(1);
        int asciiOfE = (int) c;
        System.out.println("Ascii value of e is: " + asciiOfE);
    }
}
Output:
Ascii value of e is: 101
You can even assign the char directly to an int, but it is a good idea to cast it explicitly for readability.
You can change the cast line to the line below and the program will still work:
int asciiOfE = c;
Using toCharArray()
You can simply use an index with toCharArray() to get the ASCII value of a character in the String.
Here is an example:
package org.arpit.java2blog;

public class CharToASCIICast {
    public static void main(String[] args) {
        String s = "Hello";
        char c = s.toCharArray()[1];
        int asciiOfE = (int) c;
        System.out.println("Ascii value of e is: " + asciiOfE);
    }
}
Output:
Ascii value of e is: 101
Using String’s getBytes()
You can convert a String to a byte array using getBytes(StandardCharsets.US_ASCII), and this byte array will contain the characters' ASCII values. You can access an individual value by indexing into the byte array.
Here is an example:
package org.arpit.java2blog;

import java.nio.charset.StandardCharsets;

public class CharToASCIIGetBytes {
    public static void main(String[] args) {
        String str = "Hello";
        byte[] bytes = str.getBytes(StandardCharsets.US_ASCII);
        System.out.println("Ascii value of e is: " + bytes[1]);
        System.out.println("ASCII values for all characters are:");
        for (byte b : bytes) {
            System.out.print(b + " ");
        }
    }
}
Output:
Ascii value of e is: 101
ASCII values for all characters are:
72 101 108 108 111
Using String’s chars() [Java 8+]
You can convert a String to an IntStream using String’s chars() method, use boxed() to convert it to a Stream of the wrapper type Integer, and collect it to a list. The resulting list will contain the ASCII values of all the characters, and you can use an index to access the ASCII value of an individual character.
Here is an example:
package org.arpit.java2blog;

import java.util.List;
import java.util.stream.Collectors;

public class CharToASCIIUsingIntStream {
    public static void main(String[] args) {
        String str = "Hello";
        List<Integer> asciiIntegers = str.chars()
                .boxed()
                .collect(Collectors.toList());
        System.out.println("ASCII values for all characters are:");
        for (int i : asciiIntegers) {
            System.out.print(i + " ");
        }
    }
}
Output:
ASCII values for all characters are:
72 101 108 108 111
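If a list of Integer objects is not needed, the same stream can be collected straight into an int array, which skips the boxing step. This is a minimal sketch; the class name CharsToArrayDemo and the helper codeUnits are hypothetical names for this example.

```java
import java.util.Arrays;

final class CharsToArrayDemo {
    // Returns the UTF-16 code units of the string as an int array.
    static int[] codeUnits(String s) {
        return s.chars().toArray();
    }

    public static void main(String[] args) {
        int[] codes = codeUnits("Hello");
        System.out.println(Arrays.toString(codes)); // [72, 101, 108, 108, 111]
        System.out.println(codes[1]);               // 101, the value for 'e'
    }
}
```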
Convert a String of letters to an int of corresponding ascii
If you want to convert an entire String into the concatenated ASCII values as a single integer, you can build a StringBuilder from the String’s ASCII values and convert it to a BigInteger.
Here is an example:
package org.arpit.java2blog;

import java.math.BigInteger;

public class CharToASCIIInt {
    public static void main(String[] args) {
        String str = "Hello";
        StringBuilder sb = new StringBuilder();
        for (char ch : str.toCharArray()) {
            sb.append((int) ch);
        }
        BigInteger biAscii = new BigInteger(sb.toString());
        System.out.println(biAscii);
    }
}
Output:
72101108108111
That’s all about converting a character to ASCII in Java.