Mercurial > hg4j
annotate src/org/tmatesoft/hg/internal/EncodingHelper.java @ 510:90093ee56c0d
Full-fledged test repo to follow file history. Investigating iteration direction alternatives (from new to old in addition to existing old to new)
author | Artem Tikhomirov <tikhomirov.artem@gmail.com> |
---|---|
date | Thu, 13 Dec 2012 15:46:40 +0100 (2012-12-13) |
parents | 909306e412e2 |
children | 0be5be8d57e9 |
rev | line source |
---|---|
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
1 /* |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
2 * Copyright (c) 2011-2012 TMate Software Ltd |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
3 * |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
4 * This program is free software; you can redistribute it and/or modify |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
5 * it under the terms of the GNU General Public License as published by |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
6 * the Free Software Foundation; version 2 of the License. |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
7 * |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
8 * This program is distributed in the hope that it will be useful, |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
9 * but WITHOUT ANY WARRANTY; without even the implied warranty of |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
10 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
11 * GNU General Public License for more details. |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
12 * |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
13 * For information on how to redistribute this software under |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
14 * the terms of a license other than GNU General Public License |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
15 * contact TMate Software at support@hg4j.com |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
16 */ |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
17 package org.tmatesoft.hg.internal; |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
18 |
456
909306e412e2
Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
418
diff
changeset
|
19 import static org.tmatesoft.hg.util.LogFacility.Severity.Error; |
909306e412e2
Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
418
diff
changeset
|
20 |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
21 import java.nio.ByteBuffer; |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
22 import java.nio.CharBuffer; |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
23 import java.nio.charset.CharacterCodingException; |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
24 import java.nio.charset.Charset; |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
25 import java.nio.charset.CharsetDecoder; |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
26 import java.nio.charset.CharsetEncoder; |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
27 |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
28 import org.tmatesoft.hg.core.SessionContext; |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
29 |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
30 /** |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
31 * Keep all encoding-related issues in the single place |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
32 * NOT thread-safe (encoder and decoder requires synchronized access) |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
33 * @author Artem Tikhomirov |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
34 * @author TMate Software Ltd. |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
35 */ |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
36 public class EncodingHelper { |
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
37 // XXX perhaps, shall not be full of statics, but rather an instance coming from e.g. HgRepository? |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
38 /* |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
39 * To understand what Mercurial thinks of UTF-8 and Unix byte approach to names, see |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
40 * http://mercurial.808500.n3.nabble.com/Unicode-support-request-td3430704.html |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
41 */ |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
42 |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
43 private final SessionContext sessionContext; |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
44 private final CharsetEncoder encoder; |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
45 private final CharsetDecoder decoder; |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
46 |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
47 EncodingHelper(Charset fsEncoding, SessionContext ctx) { |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
48 sessionContext = ctx; |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
49 decoder = fsEncoding.newDecoder(); |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
50 encoder = fsEncoding.newEncoder(); |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
51 } |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
52 |
418
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
53 /** |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
54 * Translate file names from manifest to amazing Unicode string |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
55 */ |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
56 public String fromManifest(byte[] data, int start, int length) { |
418
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
57 return decodeWithSystemDefaultFallback(data, start, length); |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
58 } |
418
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
59 |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
60 /** |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
61 * @return byte representation of the string directly comparable to bytes in manifest |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
62 */ |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
63 public byte[] toManifest(String s) { |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
64 if (s == null) { |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
65 // perhaps, can return byte[0] in this case? |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
66 throw new IllegalArgumentException(); |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
67 } |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
68 try { |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
69 // synchonized(encoder) { |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
70 ByteBuffer bb = encoder.encode(CharBuffer.wrap(s)); |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
71 // } |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
72 byte[] rv = new byte[bb.remaining()]; |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
73 bb.get(rv, 0, rv.length); |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
74 return rv; |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
75 } catch (CharacterCodingException ex) { |
456
909306e412e2
Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
418
diff
changeset
|
76 sessionContext.getLog().dump(getClass(), Error, ex, String.format("Use of charset %s failed, resort to system default", charset().name())); |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
77 // resort to system-default |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
78 return s.getBytes(); |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
79 } |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
80 } |
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
81 |
418
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
82 /** |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
83 * Translate file names from dirstate to amazing Unicode string |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
84 */ |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
85 public String fromDirstate(byte[] data, int start, int length) { |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
86 return decodeWithSystemDefaultFallback(data, start, length); |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
87 } |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
88 |
418
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
89 private String decodeWithSystemDefaultFallback(byte[] data, int start, int length) { |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
90 try { |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
91 return decoder.decode(ByteBuffer.wrap(data, start, length)).toString(); |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
92 } catch (CharacterCodingException ex) { |
456
909306e412e2
Refactor LogFacility and SessionContext, better API for both
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
418
diff
changeset
|
93 sessionContext.getLog().dump(getClass(), Error, ex, String.format("Use of charset %s failed, resort to system default", charset().name())); |
418
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
94 // resort to system-default |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
95 return new String(data, start, length); |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
96 } |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
97 } |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
98 |
528b6780a8bd
A bit of FIXME cleanup (mostly degraded to TODO post 1.0), comments and javadoc
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
415
diff
changeset
|
99 private Charset charset() { |
412
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
100 return encoder.charset(); |
63c5a9d7ca3f
Follow-up for Issue 29: unify path translation for manifest and dirstate
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
321
diff
changeset
|
101 } |
415
ee8264d80747
Explicit constant for regular file flags, access to flags for a given file revision
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
412
diff
changeset
|
102 |
320
678e326fd27c
Issue 15: Exception accessing oddly named file from history
Artem Tikhomirov <tikhomirov.artem@gmail.com>
parents:
diff
changeset
|
103 } |