annotate src/org/tmatesoft/hg/internal/EncodingHelper.java @ 440:299870249a28

Issue 30: bogus IOException for mmap file on linux
author Artem Tikhomirov <tikhomirov.artem@gmail.com>
date Thu, 19 Apr 2012 19:18:25 +0200
parents 528b6780a8bd
children 909306e412e2
rev   line source
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
1 /*
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
2 * Copyright (c) 2011-2012 TMate Software Ltd
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
3 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
4 * This program is free software; you can redistribute it and/or modify
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
5 * it under the terms of the GNU General Public License as published by
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
6 * the Free Software Foundation; version 2 of the License.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
7 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
8 * This program is distributed in the hope that it will be useful,
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
11 * GNU General Public License for more details.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
12 *
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
13 * For information on how to redistribute this software under
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
14 * the terms of a license other than GNU General Public License
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
15 * contact TMate Software at support@hg4j.com
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
16 */
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
17 package org.tmatesoft.hg.internal;
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
18
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
19 import java.nio.ByteBuffer;
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
20 import java.nio.CharBuffer;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
21 import java.nio.charset.CharacterCodingException;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
22 import java.nio.charset.Charset;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
23 import java.nio.charset.CharsetDecoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
24 import java.nio.charset.CharsetEncoder;
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
25
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
26 import org.tmatesoft.hg.core.SessionContext;
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
27
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
28 /**
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
29 * Keep all encoding-related issues in the single place
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
30 * NOT thread-safe (encoder and decoder requires synchronized access)
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
31 * @author Artem Tikhomirov
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
32 * @author TMate Software Ltd.
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
33 */
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
34 public class EncodingHelper {
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
35 // XXX perhaps, shall not be full of statics, but rather an instance coming from e.g. HgRepository?
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
36 /*
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
37 * To understand what Mercurial thinks of UTF-8 and Unix byte approach to names, see
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
38 * http://mercurial.808500.n3.nabble.com/Unicode-support-request-td3430704.html
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
39 */
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
40
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
41 private final SessionContext sessionContext;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
42 private final CharsetEncoder encoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
43 private final CharsetDecoder decoder;
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
44
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
45 EncodingHelper(Charset fsEncoding, SessionContext ctx) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
46 sessionContext = ctx;
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
47 decoder = fsEncoding.newDecoder();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
48 encoder = fsEncoding.newEncoder();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
49 }
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
50
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
51 /**
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
52 * Translate file names from manifest to amazing Unicode string
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
53 */
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
54 public String fromManifest(byte[] data, int start, int length) {
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
55 return decodeWithSystemDefaultFallback(data, start, length);
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
56 }
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
57
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
58 /**
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
59 * @return byte representation of the string directly comparable to bytes in manifest
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
60 */
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
61 public byte[] toManifest(String s) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
62 if (s == null) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
63 // perhaps, can return byte[0] in this case?
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
64 throw new IllegalArgumentException();
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
65 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
66 try {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
67 // synchonized(encoder) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
68 ByteBuffer bb = encoder.encode(CharBuffer.wrap(s));
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
69 // }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
70 byte[] rv = new byte[bb.remaining()];
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
71 bb.get(rv, 0, rv.length);
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
72 return rv;
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
73 } catch (CharacterCodingException ex) {
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
74 sessionContext.getLog().error(getClass(), ex, String.format("Use of charset %s failed, resort to system default", charset().name()));
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
75 // resort to system-default
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
76 return s.getBytes();
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
77 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
78 }
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
79
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
80 /**
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
81 * Translate file names from dirstate to amazing Unicode string
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
82 */
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
83 public String fromDirstate(byte[] data, int start, int length) {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
84 return decodeWithSystemDefaultFallback(data, start, length);
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
85 }
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
86
418
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
87 private String decodeWithSystemDefaultFallback(byte[] data, int start, int length) {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
88 try {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
89 return decoder.decode(ByteBuffer.wrap(data, start, length)).toString();
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
90 } catch (CharacterCodingException ex) {
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
91 sessionContext.getLog().error(getClass(), ex, String.format("Use of charset %s failed, resort to system default", charset().name()));
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
92 // resort to system-default
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
93 return new String(data, start, length);
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
94 }
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
95 }
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
96
528b6780a8bd A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 415
diff changeset
97 private Charset charset() {
412
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
98 return encoder.charset();
63c5a9d7ca3f Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 321
diff changeset
99 }
415
ee8264d80747 Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents: 412
diff changeset
100
320
678e326fd27c Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff changeset
101 }