Update ICU to 64.1 + Chromium patches
What's new in ICU 64.1:
- Unicode 12: 554 new characters, including 4 new scripts and 61 new
emoji characters.
- CLDR 35 locale data
http://blog.unicode.org/2019/03/unicode-cldr-version-35-languagelocale.html
- ICU 64 now uses "rearguard" TZ data. (Recent versions have used
"vanguard" data with certain overrides.) (ICU-20398)
- ICU data filtering: The ICU4C build accepts an optional filter
script that specifies a subset of the data to be built, with
whitelists and blacklists for locales and for resource bundle paths.
(ICU-10923, design doc)
- MessageFormat has new pattern syntax for specifying the style of
a date/time argument via a locale-independent skeleton rather than
a locale-specific pattern. (ICU-9622)
* Date/time skeletons use the same "::" prefix as number skeletons.
* Example MessageFormat pattern string:
"We close on {closing,date,::MMMMd} at {closing,time,::jm}."
- Many formatting APIs can now output a new type of result object
which is-a FormattedValue (Java & C++), or convertible to a
UFormattedValue (C).
* These combine the result strings with easy iteration over
FieldPosition metadata.
- New C++ class LocaleBuilder for building a Locale from subtags,
keywords, and extensions. (ICU-20328) Parallel to the existing
ICU4J ULocale.Builder class.
- For C++ MeasureUnit instances, there are now additional factory
methods that return units by value, not by pointer-with-ownership.
(ICU-20337)
- Various Out-Of-Memory (OOM) issues have been fixed. (ticket query)
- See http://site.icu-project.org/download/64 for more details.
The update steps are recorded at:
https://chromium.googlesource.com/chromium/deps/icu/+log/20690c6..6d422ff
- Update update.sh to point to ICU's new repo location
- Import the pristine copy of ICU 64.1 and update BUILD
files with update.sh
- Update and apply locale data patches
1. patches/locale_google.patch:
* Google's internal ICU locale changes
* Simpler region names for Hong Kong and Macau in all locales
* Currency signs in ru and uk locales (do not include 'tr' locale changes)
* AM/PM, midnight, noon formatting for a few Indian locales
* Timezone name changes in Korean and Chinese locales
* Default digits for the Arabic locale are European digits.
- patches/locale1.patch: Minor fixes for Korean
2. Breakiterator patches
- patches/wordbrk.patch for word.txt
a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
FQDN labels can be split at '.'
b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
See http://unicode.org/cldr/trac/ticket/6555
- patches/khmer-dictbe.patch
Adjust parameters to use a smaller Khmer dictionary (khmerdict.txt).
https://unicode-org.atlassian.net/browse/ICU-9451
- Add several common Chinese words that were dropped previously to
source/data/cjdict/brkitr/cjdict.txt
patch: patches/cjdict.patch
upstream bug: https://unicode-org.atlassian.net/browse/ICU-10888
3. Build-related changes
- patches/configure.patch:
* Remove a section of configure that causes breakage when
running runConfigureICU.
- patches/wpo.patch (only needed when icudata dll is used).
upstream bugs : https://unicode-org.atlassian.net/browse/ICU-8043
https://unicode-org.atlassian.net/browse/ICU-5701
- patches/data_symb.patch :
Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use
the icu data file or icudt.dll
- patches/staticmutex.patch :
Change the static UMutex code to avoid static_initializers error.
upstream bug: https://unicode-org.atlassian.net/browse/ICU-20520
- patches/buildtool.patch :
Fix the build tool, which omitted res_index.res and */res_index.res files
upstream bug: https://unicode-org.atlassian.net/browse/ICU-20529
upstream PR: https://github.com/unicode-org/icu/pull/571/
4. Double conversion library build failure
- patches/double_conversion.patch
- upstream bugs:
https://unicode-org.atlassian.net/browse/ICU-13750
https://github.com/google/double-conversion/issues/66
5. ISO-2022-JP encoding (fromUnicode) change per WHATWG encoding spec.
- patches/iso2022jp.patch
- upstream bug:
https://unicode-org.atlassian.net/browse/ICU-20251
- ICU data files are rebuilt
Up to a 67 KB increase. Since we also save 43 KB in
https://chromium-review.googlesource.com/c/v8/v8/+/1478710 ,
the net increase is only 24 KB.
** ICU Data Size Change **
Data Size      ICU63   ICU64-1    DIFF
chromeos    10326064  10378624   52560
common      10326064  10394816   68752
cast         5126144   5101616  -24528
android      6355520   6406256   50736
ios          6315248   6372016   56768
flutter       880928    894752   13824
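The DIFF column and the roughly 67 KB upper bound quoted above can be re-derived from the two size columns. The following is a minimal sketch for checking that arithmetic, assuming the sizes are in bytes and 1 KB = 1024 bytes; the names DataSize, kSizes, diffBytes and largestIncreaseBytes are hypothetical and are not part of ICU or of this change.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// One row of the size table: configuration name and data size in bytes.
struct DataSize {
    const char* config;
    int64_t icu63;
    int64_t icu64;
};

// Values copied from the table above.
constexpr DataSize kSizes[] = {
    {"chromeos", 10326064, 10378624},
    {"common",   10326064, 10394816},
    {"cast",      5126144,  5101616},
    {"android",   6355520,  6406256},
    {"ios",       6315248,  6372016},
    {"flutter",    880928,   894752},
};

// Size change in bytes for one configuration (negative means it shrank).
constexpr int64_t diffBytes(const DataSize& s) {
    return s.icu64 - s.icu63;
}

// Largest increase across all configurations, in bytes (0 if none grew).
int64_t largestIncreaseBytes() {
    int64_t best = 0;
    for (const DataSize& s : kSizes) {
        best = std::max(best, diffBytes(s));
    }
    return best;
}
```

The largest increase is the "common" configuration, and the only configuration that shrinks is "cast".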
Created by:
git rev-list --reverse 20690c6..6d422ff | \
xargs git cherry-pick --strategy=recursive -X theirs
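For reviewers unfamiliar with the pattern behind patches/staticmutex.patch, here is a minimal sketch of the idea. It assumes std::mutex as a stand-in for ICU's UMutex; loadMutex, recordLoad and gLoadCount are hypothetical names used only for illustration and do not appear in ICU.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for state that must be guarded by the mutex.
static int gLoadCount = 0;

// Returns a mutex that lives for the whole process. The function-local
// static pointer is initialized on first call (thread-safe since C++11)
// and is intentionally never deleted, so the mutex needs neither a
// load-time static initializer nor an exit-time destructor.
static std::mutex* loadMutex() {
    static std::mutex* m = new std::mutex();
    return m;
}

// Hypothetical helper: bumps the counter while holding the lock.
int recordLoad() {
    std::lock_guard<std::mutex> lock(*loadMutex());
    return ++gLoadCount;
}
```

The trade-off is a deliberate one-time leak of the mutex object in exchange for having no global constructor or destructor for it.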
Bug: chromium:943348
Change-Id: Ia7f86abfa8625dd24aae2f71456abd679fda3dae
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/deps/icu/+/1552155
Reviewed-by: Jungshik Shin <jshin@chromium.org>
diff --git a/source/common/udata.cpp b/source/common/udata.cpp
index 07756bb..b62095c 100644
--- a/source/common/udata.cpp
+++ b/source/common/udata.cpp
@@ -831,8 +831,8 @@
* Use a specific mutex to avoid nested locks of the global mutex.
*/
#if MAP_IMPLEMENTATION==MAP_STDIO
- static UMutex extendICUDataMutex = U_MUTEX_INITIALIZER;
- umtx_lock(&extendICUDataMutex);
+ static UMutex *extendICUDataMutex = new UMutex();
+ umtx_lock(extendICUDataMutex);
#endif
if(!umtx_loadAcquire(gHaveTriedToLoadCommonData)) {
/* See if we can explicitly open a .dat file for the ICUData. */
@@ -868,7 +868,7 @@
/* Also handles a race through here before gHaveTriedToLoadCommonData is set. */
#if MAP_IMPLEMENTATION==MAP_STDIO
- umtx_unlock(&extendICUDataMutex);
+ umtx_unlock(extendICUDataMutex);
#endif
return didUpdate; /* Return true if ICUData pointer was updated. */
/* (Could potentially have been done by another thread racing */