Top to bottom file representation order is not the same as Left to Right

Robin

10 Apr, 2018 01:01 PM

MMD Composer edit display and preview display diverge for MMD tables in which pipes have been inserted in a RTL sequence of strings.

This appears to be because of an understandable (but technically incorrect, and somewhat provincial) assumption that LTR cell order is the 'natural' interpretation of top to bottom string sequence representation in a file.

LTR cell order is never 'natural', but is correct for LTR scripts.

If this is fixed, and RTL cell sequences allowed to be RTL, then the WYSIWYG failure in MMD Composer (and in MMD table rendering generally) will be corrected.

1. Producing `<td>` cells in the order of the source file stream is correct.
2. Appending them to the Right when the text is RTL is *incorrect*, and merely an expression of a provincial blind spot.

This leads to a WYSIWYG collapse in your preview, and to the generation of incorrect HTML.

Example for testing in MMD Composer

```
The bidirectional cell here is displayed and rendered correctly.

| Example ||||
|:-------:|:-------:|:-------:|:-------:|
| Genesis בראשית ברא אלהים ||||
| alpha | beta | gamma | delta |

But when pipe characters are inserted between the RTL words, the generated RTL `<td>` cells are appended to the Right instead of to the Left.

This is simply a technical error (revealed by the divergence of the editing and Preview displays) rooted in a provincial assumption that 'first then' is 'naturally' the same as left then right :-)

| Example ||||
|:-------:|:------:|:-----:|:-----:|
| Genesis | בראשית | ברא | אלהים |
| alpha | beta | gamma | delta |

```
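As a rough illustration of point 1 above (cells produced in source file order), here is a minimal Python sketch. It is not MultiMarkdown's actual parser: the helper names `row_to_cells` and `row_to_html` are hypothetical, and the sketch ignores MMD's colspan syntax (`||||`), alignment rows, and escaped pipes. It only shows that splitting on pipes yields cells in the order the code points are stored in the file, whatever order an editor displays them in.

```python
from html import escape


def row_to_cells(row: str) -> list[str]:
    """Split one pipe-delimited row into cell strings, in stored (file) order."""
    stripped = row.strip()
    # Drop the outer pipes, then split on the inner ones.
    if stripped.startswith("|"):
        stripped = stripped[1:]
    if stripped.endswith("|"):
        stripped = stripped[:-1]
    return [cell.strip() for cell in stripped.split("|")]


def row_to_html(row: str) -> str:
    """Emit one <tr> whose <td> elements follow the stored cell order."""
    cells = row_to_cells(row)
    return "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>"


row = "| Genesis | בראשית | ברא | אלהים |"
cells = row_to_cells(row)
print(cells)
print(row_to_html(row))
```

Where each `<td>` then lands on screen is decided by the direction of the HTML table or document, not by the order of the elements themselves.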

  1. Support Staff 31 Posted by fletcher on 15 Apr, 2018 08:21 PM

    PS spreadsheets are irrelevant to the relationship between bidirectional text and MMD tables.

    Spreadsheet grids are not bidirectional.

    Text is.

    This is the part you're misunderstanding. Tables in MMD are spreadsheets. Once you grasp this, the rest falls into place.

  2. 32 Posted by robintrew on 15 Apr, 2018 09:20 PM

    > MMD tables are spreadsheets

    Fantasy.

    MMD is a plain text markup.

    Plain text on macOS is Unicode-compliant these days. From the Bash shell upwards.

    Your MMD translation logic fails to grasp that, and discards user information.

    That's why the relationship between the left and right hand displays of MultiMarkdown Composer is broken.

    And will remain broken, in increasingly extreme and egregious isolation.

  3. 33 Posted by robintrew on 15 Apr, 2018 09:59 PM

    > Tables in MMD are spreadsheets. Once you grasp this, the rest falls into place.

    Well ... an interesting (if eccentric) hypothesis, but it still fails experimentally.

    Even with mono-directional spreadsheets MMD fails.

    If we:

    1. Copy an area from an RTL spreadsheet in Google Sheets
    2. Paste and search-replace tabs to pipes

    - Core Text still gets it right
    - MMD still gets it wrong
    - The relationship between the Plain text and HTML displays is still broken
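The two experimental steps above (copy from a spreadsheet, then replace tabs with pipes) can be sketched in Python as follows. This is an assumption-laden illustration: the function name `tsv_to_mmd_rows` is hypothetical, the clipboard is assumed to hold tab-separated text, and no header separator row is generated. The sketch only rearranges delimiters; it neither adds nor removes any direction information from the pasted code points.

```python
def tsv_to_mmd_rows(clipboard_text: str) -> list[str]:
    """Turn tab-separated clipboard lines into pipe-delimited table rows."""
    rows = []
    for line in clipboard_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        cells = line.split("\t")
        rows.append("| " + " | ".join(cells) + " |")
    return rows


sample = "alpha\tbeta\tgamma\nאחת\tשתים\tשלוש"
rows = tsv_to_mmd_rows(sample)
for r in rows:
    print(r)
```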

  4. Support Staff 34 Posted by fletcher on 15 Apr, 2018 10:49 PM

    You have RTL enabled for your entire spreadsheet, so you would need to enable dir="rtl" for the generated HTML table.

    Alternatively, disable RTL in the spreadsheet itself (you can still use RTL script inside individual table cells). Then you do not need to use the dir attribute.
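The `dir="rtl"` suggestion above can be illustrated with a small Python sketch that builds an HTML table with an optional `dir` attribute. The function name `table_html` and its `direction` parameter are hypothetical, introduced only for this example; this is not how MultiMarkdown generates its output.

```python
from html import escape
from typing import Optional


def table_html(rows: list[list[str]], direction: Optional[str] = None) -> str:
    """Build a <table>; cells stay in source order, `dir` sets the layout direction."""
    if direction not in (None, "ltr", "rtl"):
        raise ValueError("direction must be None, 'ltr' or 'rtl'")
    attr = f' dir="{direction}"' if direction else ""
    body = "\n".join(
        "  <tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table{attr}>\n{body}\n</table>"


print(table_html([["Genesis", "בראשית", "ברא", "אלהים"]], direction="rtl"))
```

With `dir="rtl"` a browser lays the first `<td>` out at the right edge of the row; without it, the same markup starts from the left.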

  5. 35 Posted by robintrew on 15 Apr, 2018 11:19 PM

    Any standards-compliant plain text to HTML tool worth its dollars should be handling both:

    1. The `<table dir="rtl">` case, and
    2. the bidirectional (cell sequence) case.

    Once you have finally caught up with:

    1. Industry standards ( unicode.org/reports/tr9/ ) and
    2. the bulk of other macOS applications (from bash, cat, vi and emacs upwards)

    the relationship between your plain text and HTML displays will no longer be broken.

  6. 36 Posted by robintrew on 22 Apr, 2018 03:13 PM

    > Tables in MMD are spreadsheets. Once you grasp this, the rest falls into place.

    No. It doesn't.

    Real spreadsheets don't discard the direction information in the user's Unicode.

    See MS Excel in Right to Left mode below, and an MS Excel clipboard – tabs search-replaced to pipes, pasted into MMD Composer.

    The Core Text display on the left, like Google Sheets, and like MS Excel, gets the Unicode display direction right.

    MMD on the right, discards the user's display directions, and gets it wrong.

    If "tables in MMD are spreadsheets", then to bring them in line with MS Excel and Google Sheets, (let alone Unicode and MMD Composer) they need to be fixed.

  7. Support Staff 37 Posted by fletcher on 22 Apr, 2018 04:32 PM

    Again -- you have RTL enabled for the spreadsheet, you would therefore need to enable it for the HTML document (or at least for the table) for the orientations to match.

    When you copy text from a spreadsheet, the RTL setting of the spreadsheet doesn't come with it. Only the contents of the cells. So when you paste the text, the default goes back to being LTR. CoreText will automatically invert the lines containing RTL text that do not also contain LTR text. And since punctuation, whitespace, and digits do not seem to reset the direction, they remain in RTL. But all of this is because CoreText is treating the MMD source as if it were plain text, not as if it were a markup language.

    Another way of thinking about it.... Column A is always "before" column B. If the document is LTR, this means that Column A is to the left of Column B. If the document is RTL, then Column A is to the right of Column B. If your HTML document is LTR, then Column A is left of Column B.

    Which brings us right back to where we started -- if you want to use a RTL spreadsheet, then the HTML document will need to be RTL also for them to match.

    Alternatively, use a LTR spreadsheet, but the individual cells can be whatever you want. Then you don't have to change the dir tag in your HTML.
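The "Column A is always before Column B" framing above can be sketched as a mapping from logical column order to on-screen order. This is a simplified model under stated assumptions: the function name `visual_columns` is hypothetical, and the sketch treats the whole table as having a single direction, which is the model described in this post rather than the per-row bidirectional behaviour requested elsewhere in the thread.

```python
def visual_columns(columns: list[str], direction: str = "ltr") -> list[str]:
    """Return the columns as they appear on screen, listed from left to right."""
    if direction == "ltr":
        return list(columns)
    if direction == "rtl":
        # Logical order is unchanged; only the placement is mirrored.
        return list(reversed(columns))
    raise ValueError("direction must be 'ltr' or 'rtl'")


logical = ["A", "B", "C"]
print(visual_columns(logical, "ltr"))
print(visual_columns(logical, "rtl"))
```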

  8. 38 Posted by robintrew on 22 Apr, 2018 05:04 PM

    Perhaps you misunderstand not only Unicode display directions, but also the purpose and nature of MMD markup ?

    Let me explain. MMD is a plain text markup - the purpose of which is to convert from Unicode plain texts to equivalent HTML etc.

    The purpose of the user, on the other hand, is not to manually compose or mend the HTML – that's the job of the markup conversion supplier.

    Again Apple Numbers (like MS Excel and Google Sheets and the Core Text editor of MMD Composer) gets it right.

    The only tool that fails – apparently just through obdurate Unicode defiance (or perhaps, more generously, just through lack of awareness or understanding) – is the MMD converter.

  9. 39 Posted by robintrew on 22 Apr, 2018 05:06 PM

    > When you copy text from a spreadsheet, the RTL setting of the spreadsheet doesn't come with it.

    Another complete technical misapprehension.

    All of the RTL direction information comes in the Unicode. That is why all tools but yours can get it right.

  10. 40 Posted by robintrew on 22 Apr, 2018 05:09 PM

    Screenshot of Apple Numbers - source of data pasted to MMD Composer above.

    All of the direction information is contained in the clipboard Unicode.

  11. 41 Posted by robintrew on 22 Apr, 2018 05:49 PM

    In particular, re

    http://unicode.org/reports/tr9/
    https://www.w3.org/International/articles/inline-bidi-markup/uba-basics

    In terms of the bidirectional component of the Unicode standard, punctuation characters such as the pipe are members of the neutral class.

    They have no inherent directionality, and inherit it from the preceding (top to bottom, not necessarily 'left right') strong characters.

    Unicode-compliant tools like Core Text display the MMD digit cells (examples above) in the correct context-sensitive sequence because numeric strings, like punctuation, get their direction from the 'strong' characters (RTL or LTR) preceding them.
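The character classes discussed above can be inspected with Python's standard `unicodedata` module, which reports the Bidi_Class property that TR9 works from. The sample characters are chosen only for illustration. Note that the property data puts the pipe in a neutral class (`ON`) and ASCII digits in a weak class (`EN`); how either is finally displayed depends on the surrounding strong characters and the paragraph direction, which this sketch does not attempt to resolve.

```python
import unicodedata

samples = {
    "pipe": "|",
    "space": " ",
    "ascii digit": "7",
    "latin letter": "a",
    "hebrew alef": "\u05d0",
}

# Map each sample to its Bidi_Class abbreviation (L, R, EN, WS, ON, ...).
classes = {name: unicodedata.bidirectional(ch) for name, ch in samples.items()}

for name, bidi_class in classes.items():
    print(f"{name}: {bidi_class}")
```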

  12. Support Staff 42 Posted by fletcher on 22 Apr, 2018 07:19 PM

    On 4/22/18 1:04 PM, robintrew wrote:

    > Perhaps you misunderstand not only Unicode display directions, but also the purpose and nature of MMD markup ?
    >
    > Let me explain. MMD is a plain text markup - the purpose of which is to convert from Unicode plain texts to equivalent HTML etc.
    >
    > The purpose of the user, on the other hand, is not to manually compose or mend the HTML – that's the job of the markup conversion supplier.
    >
    > Again Apple Numbers (like MS Excel and Google Sheets and the Core Text editor of MMD Composer) gets it right.
    >
    > The only tool that fails – apparently just through obdurate Unicode defiance (or perhaps, more generously, just through lack of awareness or understanding) – is the MMD converter.

    And there I thought you had no sense of humor. Explaining to me the purpose of my own software.... That's a good one! I think it's a safe assumption that I know more about the purpose of MMD than anyone else on the planet....

  13. Support Staff 43 Posted by fletcher on 22 Apr, 2018 07:22 PM

    On 4/22/18 1:09 PM, robintrew wrote:

    > Screenshot of Apple Numbers

    Again -- you're using incomplete examples. You need a mix of English cells and Hebrew cells in the same row to test your hypothesis. Digits/Whitespace/Pipes are not sufficient to alter directionality.

    Another example -- copy a section from an LTR table and paste it into TextEdit. Now flip the spreadsheet so that it is RTL (leaving the cell contents the same). Paste that into another TextEdit document. The two text documents will be identical. The orientation of the original spreadsheet is lost.

  14. Support Staff 44 Posted by fletcher on 22 Apr, 2018 07:34 PM

    I tried to use RTL for column ordering in Numbers to test there as well, but that seems to be locked to the system language. Are you able to toggle it back and forth like Excel/LibreOffice?

  15. 45 Posted by robintrew on 22 Apr, 2018 08:09 PM

    > Numbers

    The:

    - Format > Text > Reverse Direction, and
    - Table > Reverse Table Direction

    menu items are active if the user's set of installed IMEs includes one or more RTL scripts.

  16. 46 Posted by robintrew on 22 Apr, 2018 08:12 PM

    > You need a mix of English cells and Hebrew cells in the same row to test your hypothesis. Digits/Whitespace/Pipes are not sufficient to alter directionality.

    That hypothesis was already confirmed by the Unicode.txt example far above (file attached there).

    The error in your methodology is simply that you are using one of the few remaining pieces of software on macOS which is not Unicode-compliant and (unlike the Core Text in your own MMD Composer) discards the Unicode display sequence data (TextEdit).

  17. 47 Posted by robintrew on 22 Apr, 2018 08:21 PM

    > I know more about the purpose of MMD than anyone else on the planet

    One might have hoped so, but anxiety to define a bug as a righteous feature was clearly clouding judgement and memory.

    At one point you were pinning hopes on the possibility that MMD might turn out to be a spreadsheet.

  18. 48 Posted by robintrew on 22 Apr, 2018 10:30 PM

    1. As you know TextEdit is a complete irrelevance – it is (very unusually at this date, and unlike the editor even in MMD Composer) not Unicode-compliant. It discards the User's Unicode display direction information.

    2. A spreadsheet is a special case with monotonically increasing Cartesian grid references – it's completely irrelevant to the bidirectional row case, which arises with textual (e.g. lexical) tables. Your canonical example at http://fletcher.github.io/MultiMarkdown-5/tables.html (not a number in sight), is an explicit acknowledgement that not all tables are spreadsheets. MMD isn't even Unicode-compliant at the level of RTL `<table>` elements, let alone at the level of bidirectional `<th>` and `<td>` sequences or RTL `<span>` elements.

    Your task is simple. All you have to do is raise MMD's game to the level of Unicode compliance achieved by vi, emacs, Core Text, Atom, Numbers, MS Office, Excel and Google Sheets etc.

    Once you have done that, users will know that the tables which they have designed in Unicode plain text will be predictably and matchingly displayed in HTML etc.

  19. Support Staff 49 Posted by fletcher on 22 Apr, 2018 11:04 PM

    On 4/22/18 6:30 PM, robintrew wrote:

    > 1. As you know TextEdit is a complete irrelevance – it is (very unusually at this date, and unlike the editor even in MMD Composer) not Unicode-compliant. It discards the User's Unicode display direction information.

    TextEdit uses the same Core Text engine that Composer uses -- that which is provided by Apple as part of macOS. Composer adds a few extra layers on top, but fundamentally they use the same code that is found throughout macOS almost every place where text is used (and on iOS, albeit with some slight differences).

    At this point, I'm calling it quits. Either my ability to explain what is going on is not up to the task, your unwillingness to learn where your mistakes lie is too high, or a combination of both. I have provided multiple examples demonstrating the flaws in your logic, and have pointed out where your counterexamples are flawed or inaccurate.

    • MultiMarkdown properly converts tables to HTML, whether they contain RTL or LTR text within the cells. (More precisely, there have been no examples given where MMD fails to give the correct HTML -- certainly it's possible (if not probable) that other errors exist. But this thread has failed to describe any.)

    • Apple's Core Text engine treats MMD tables as text, not as tables. It does not distinguish between the pipe character by itself and a pipe when used as a table cell delimiter. This causes it to "get confused" by RTL characters inside a table, causing the visual appearance to not match the actual underlying structure. While this is a "bug" in Core Text in the context of Composer, it's because Apple's text engine was not designed to support MMD tables -- it is probably the correct behavior for most other use cases and therefore not something that should be changed.

    • If one wants entire tables to be displayed in a RTL format in HTML, one needs to use the dir="rtl" attribute for the table, or for the document as a whole.

    • If one has a complex table containing both LTR and RTL text within cells, the best way to minimize confusion is to generate the table in a spreadsheet (e.g. LibreOffice). The spreadsheet itself should be in LTR mode (unless you plan on adding the dir="rtl" to the table or document as a whole, in which case your spreadsheet should be in RTL). Copy and paste into Composer using the "paste as table" and all should be good, generally saving you a bunch of time and frustration.

    • Alternatively, write the table using raw HTML and you can do whatever you like.

    This horse has been sufficiently beaten. I will not respond further to this thread, and any further posts that are not constructive will be deleted as spam.

    Just as fair warning, Tender, the program that manages these forums, has an automatic spam detection algorithm. If any user starts to have too many posts flagged as spam (I don't know what the threshold is -- it could even be a single post for all I know), then all posts (even legitimate ones) may be flagged by the program as spam. I tend to check the spam folder every week or two, so any posts improperly flagged as spam may take me a while to discover.

  20. fletcher closed this discussion on 22 Apr, 2018 11:04 PM.
