Friday, December 13, 2013

Convert String to ASCII and ASCII to String Using Java

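Java code to convert a string to ASCII and back again. Here "ASCII" means an ASCII-only form of the string: every character other than letters, digits, space, '+', '-' and '.' is written as a decimal numeric character reference such as &#44; (the code uses a plain character loop, not a regex).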



/**
 *
 * @author Pritom K Mondal
 */
public class AsciiString {
    public static void main(String[] args) {
        String inputString = "Hi, THAI(คุณ), HINDI:(तुम मेरी हो), HANGERIO:(تو مال منی), CHINA:(您), ARBI(أنت), FARSI(شما)";
        System.out.println("ORIGINAL:      " + inputString);
        String encoded = AsciiString.encode(inputString);
        System.out.println("ASCII ENCODED: " + encoded);
        String decoded = AsciiString.decode(encoded);
        System.out.println("ASCII DECODED: " + decoded);
    }
    
    // Keeps letters, digits, space, '+', '-' and '.' as they are and writes
    // every other character as a decimal reference such as &#44;
    public static String encode(String word) {
        StringBuilder encoded = new StringBuilder();
        for (int index = 0; index < word.length(); index++) {
            char ch = word.charAt(index);
            boolean keepAsIs = (ch >= '0' && ch <= '9')
                    || (ch >= 'A' && ch <= 'Z')
                    || (ch >= 'a' && ch <= 'z')
                    || ch == ' ' || ch == '+' || ch == '-' || ch == '.';
            if (keepAsIs) {
                encoded.append(ch);
            } else {
                encoded.append("&#").append((int) ch).append(';');
            }
        }
        return encoded.toString();
    }
    
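    // Turns each &#NNN; reference back into the character it stands for
    public static String decode(String word) {
        StringBuilder decoded = new StringBuilder();
        for (int index = 0; index < word.length(); index++) {
            char ch = word.charAt(index);
            // "index + 1 < length" keeps charAt(index + 1) in range when '&' is the last character
            if (ch == '&' && index + 1 < word.length() && word.charAt(index + 1) == '#') {
                int end = word.indexOf(';', index);
                try {
                    decoded.append((char) Integer.parseInt(word.substring(index + 2, end)));
                    index = end; // skip past the reference; the loop then steps over ';'
                } catch (NumberFormatException | IndexOutOfBoundsException ex) {
                    // No closing ';' or not a number: keep the '&' as a literal character
                    decoded.append(ch);
                }
            } else {
                decoded.append(ch);
            }
        }
        return decoded.toString();
    }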
}
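
The class above works on one Java char at a time, so a character outside the Basic Multilingual Plane (an emoji, for example) is written as two surrogate values. That still round-trips through decode, but a browser will not show such a pair as the intended character. Below is a minimal sketch of a code point based variant, assuming the same keep-list of characters. The names CodePointAscii, encodeCodePoints and decodeCodePoints are made up for this illustration and are not part of the class above.

```java
class CodePointAscii {
    // Hypothetical variant of encode(): walks the string by Unicode code point
    public static String encodeCodePoints(String word) {
        StringBuilder encoded = new StringBuilder();
        for (int i = 0; i < word.length(); ) {
            int cp = word.codePointAt(i);
            boolean keepAsIs = (cp >= '0' && cp <= '9')
                    || (cp >= 'A' && cp <= 'Z')
                    || (cp >= 'a' && cp <= 'z')
                    || cp == ' ' || cp == '+' || cp == '-' || cp == '.';
            if (keepAsIs) {
                encoded.append((char) cp);
            } else {
                encoded.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp); // 2 for a surrogate pair, otherwise 1
        }
        return encoded.toString();
    }

    // Hypothetical variant of decode(): accepts code points above 65535 as well
    public static String decodeCodePoints(String word) {
        StringBuilder decoded = new StringBuilder();
        int i = 0;
        while (i < word.length()) {
            if (word.startsWith("&#", i)) {
                int end = word.indexOf(';', i);
                if (end > i + 2) {
                    try {
                        int cp = Integer.parseInt(word.substring(i + 2, end));
                        if (Character.isValidCodePoint(cp)) {
                            decoded.appendCodePoint(cp);
                            i = end + 1;
                            continue;
                        }
                    } catch (NumberFormatException ex) {
                        // not a number: fall through and copy the '&' literally
                    }
                }
            }
            decoded.append(word.charAt(i));
            i++;
        }
        return decoded.toString();
    }

    public static void main(String[] args) {
        String input = "Smile: \uD83D\uDE00"; // U+1F600, stored as a surrogate pair
        String encoded = encodeCodePoints(input);
        System.out.println("ENCODED: " + encoded);
        System.out.println("ROUND TRIP OK: " + decodeCodePoints(encoded).equals(input));
    }
}
```

For text that stays inside the Basic Multilingual Plane, such as the sample string in main above, both versions give the same result.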

The output of the above program is as follows:


ORIGINAL:      Hi, THAI(คุณ), HINDI:(तुम मेरी हो), HANGERIO:(تو مال منی), CHINA:(您), ARBI(أنت), FARSI(شما)
ASCII ENCODED: Hi&#44; THAI&#40;&#3588;&#3640;&#3603;&#41;&#44; HINDI&#58;&#40;&#2340;&#2369;&#2350; &#2350;&#2375;&#2352;&#2368; &#2361;&#2379;&#41;&#44; HANGERIO&#58;&#40;&#1578;&#1608; &#1605;&#1575;&#1604; &#1605;&#1606;&#1740;&#41;&#44; CHINA&#58;&#40;&#24744;&#41;&#44; ARBI&#40;&#1571;&#1606;&#1578;&#41;&#44; FARSI&#40;&#1588;&#1605;&#1575;&#41;
ASCII DECODED: Hi, THAI(คุณ), HINDI:(तुम मेरी हो), HANGERIO:(تو مال منی), CHINA:(您), ARBI(أنت), FARSI(شما)

Now create an HTML file named 'index.html' with the following content:


<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>String and ASCII</title>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    </head>
    <body>
        <div>ORIGINAL:      Hi, THAI(คุณ), HINDI:(तुम मेरी हो), HANGERIO:(تو مال منی), CHINA:(您), ARBI(أنت), FARSI(شما)</div>
        <div>ASCII ENCODED: Hi&#44; THAI&#40;&#3588;&#3640;&#3603;&#41;&#44; HINDI&#58;&#40;&#2340;&#2369;&#2350; &#2350;&#2375;&#2352;&#2368; &#2361;&#2379;&#41;&#44; HANGERIO&#58;&#40;&#1578;&#1608; &#1605;&#1575;&#1604; &#1605;&#1606;&#1740;&#41;&#44; CHINA&#58;&#40;&#24744;&#41;&#44; ARBI&#40;&#1571;&#1606;&#1578;&#41;&#44; FARSI&#40;&#1588;&#1605;&#1575;&#41;</div>
        <div>ASCII DECODED: Hi, THAI(คุณ), HINDI:(तुम मेरी हो), HANGERIO:(تو مال منی), CHINA:(您), ARBI(أنت), FARSI(شما)</div>
    </body>
</html>

Opened in a browser, all three lines look the same: the original string, the string after ASCII conversion, and the string converted back again. So why convert them at all?
  • The encoded strings are easy to maintain.
  • They are easy to insert into and read back from a database.
  • You do not have to worry about which characters you are inserting into the database.
  • If you create XML from the data, you do not need to worry about which characters the XML file supports, especially when you transport the data through an API server.
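
The XML point can be checked with the parser that ships with the JDK. The following is only a sketch: the class name XmlRoundTrip, the readBack method and the <message> element are invented for this example, and encode repeats the rule from AsciiString so that the snippet compiles on its own. Because '&' and '<' are not on the keep-list, they are encoded too, so the text can be placed inside an element without further escaping.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlRoundTrip {
    // Same rule as AsciiString.encode, repeated here to keep the example self-contained
    public static String encode(String word) {
        StringBuilder encoded = new StringBuilder();
        for (int index = 0; index < word.length(); index++) {
            char ch = word.charAt(index);
            boolean keepAsIs = (ch >= '0' && ch <= '9')
                    || (ch >= 'A' && ch <= 'Z')
                    || (ch >= 'a' && ch <= 'z')
                    || ch == ' ' || ch == '+' || ch == '-' || ch == '.';
            if (keepAsIs) {
                encoded.append(ch);
            } else {
                encoded.append("&#").append((int) ch).append(';');
            }
        }
        return encoded.toString();
    }

    // Wraps the encoded text in a hypothetical <message> element, parses it,
    // and returns the text the XML parser reads back
    public static String readBack(String text) throws Exception {
        String xml = "<message>" + encode(text) + "</message>";
        // The encoded document contains only ASCII bytes
        byte[] bytes = xml.getBytes(StandardCharsets.US_ASCII);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // \u0E04\u0E38\u0E13 is the Thai word from the sample string above
        String original = "Tom & Jerry <\u0E04\u0E38\u0E13>";
        System.out.println("ENCODED: " + encode(original));
        System.out.println("XML ROUND TRIP OK: " + readBack(original).equals(original));
    }
}
```

The parser resolves the numeric references itself, so the receiving side does not need to call decode. One caveat of this sketch: characters stored as surrogate pairs are written as two surrogate values, which XML does not accept as character references.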
