Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align currency with no decimal point correctly #160

Merged
merged 1 commit into from
Jul 15, 2024

Conversation

RyanGibb
Copy link
Contributor

The docs say:

If an amount has no decimal point, the imaginary decimal point to the right of the least significant digit will align.

But this was not the behaviour I observed when calling LedgerAlign, there seemed to be an off by one error. For example, with g:ledger_align_at = 50, I would get:

2024-07-13 test
  Expense                                 £10000
  Expense                                     £10.10
  Assets

This PR subtracts one from pos if there is no decimal point to give:

2024-07-13 test
  Expense                                  £10000
  Expense                                     £10.10
  Assets

Which is what I expected.

I've made this PR with what corrects this for me, but I'm not familiar with the codebase so if I've overlooked something any feedback or advice is welcome.

@alerque
Copy link
Member

alerque commented Jul 13, 2024

I see the problem (and thanks for the contribution), but I don't think this is the right solution. Yet.

I don't typically use anything without a decimal separator so I hadn't noticed this before. I tried it is some of my ledgers and the "off by one" result is sometimes off by one, sometimes off by 2, and sometimes off by 3. So basically the whole thing is screwed.

For example if you switch to $ in your example I get a different result. That suggests that we're somehow completely miscalculating characters, but it isn't like the pound sign is a double-byte character either so I don't know how that is happening. I also tried it with negative amounts, and got yet a different alignment.

Rather than a 1 character stop gap I think we need to figure out why we're not calculating the width right at all.

@RyanGibb
Copy link
Contributor Author

Rather than a 1 character stop gap I think we need to figure out why we're not calculating the width right at all.

Agreed! I'll try and dig into the root cause of this when I find some time.

@RyanGibb
Copy link
Contributor Author

I think this is due to variable unicode character byte length. See:

2024-07-13 test
  Expense                                  $10000
  Expense                                     $10.10
  Expense                                 £10000
  Expense                                     £10.10
  Expense                                €10000
  Expense                                     €10.10
  Expense                                      10.10
  Assets

Note that in UTF-8 $ is one byte, £ is two, and € is three.

matchend returns the byte index, however.

If we change:

    if pos < 0
      " Find the position after the first digits
      let pos = matchend(rhs, '\m\d[^[:space:]]*')
    endif

To (stolen from decimalpos):

      let pos = matchend(rhs, '\m\d[^[:space:]]*')
      if pos > 0
        let pos = strchars(rhs[:pos])
      endif

This appears to fix it:

2024-07-13 test
  Expense                                  $10000
  Expense                                     $10.10
  Expense                                  £10000
  Expense                                     £10.10
  Expense                                  €10000
  Expense                                     €10.10
  Expense                                      10.10
  Expense                                       0
  Expense                                      1 XXX @ $1
  Assets

@RyanGibb
Copy link
Contributor Author

Numbers with a space after them are still mis-aligned by 1, though:

2024-07-13 test
  Expense                                 $10000 ; comment
  Expense                                     $10.10
  Expense                                  £10000
  Expense                                     £10.10
  Expense                                      1 XXX @ $1
  Assets

@RyanGibb
Copy link
Contributor Author

RyanGibb commented Jul 14, 2024

This seems to be because matchend returns the index after the match, which will include an extra char when calling strchars if there is a space after it.

So:

      let pos = matchend(rhs, '\m\d[^[:space:]]*') - 1
      if pos >= 0
        let pos = strchars(rhs[:pos])
      endif

Gives us:

2024-07-13 test
  Expense                                  $10000 ; comment
  Expense                                     $10.10 ; comment
  Expense                                  £10000	; comment
  Expense                                     £10.10
  Expense                                  €10000
  Expense                                     €10.10
  Expense                                      10.10
  Expense                                       0 GBP
  Expense                                       1 XXX @ $1
  Assets

@alerque
Copy link
Member

alerque commented Jul 14, 2024

In my incredulity over not having noticed the pound sign being a multi-byte character it has come to my attention that in my own personal finance records (where I handle a lot of currencies but only a trivial about of travel related things in Pounds) I've been using (U+20A4 LIRA SIGN) instead of £ (U+00A3 POUND SIGN) in my ledger transactions. Yikes. I'll need to re-train my fingers for Compose-L instead of Compose=L.

Also an even further side note, apparently historically pound sterling (and ever other numbers after decimalization) were supposed to use · (U+00B7 MIDDLE DOT) as their decimal separator with accommodation for . (U+0023 FULL STOP) when the former was not supported. Even UK keyboards not having an obvious way to enter interpuncts has lead the full stop to be pretty much de facto, but I'm now disappointed to find hledger does not support interpunct as a possible decimal separator at all! I may have to go fix that.

In the mean time this sounds like a much better diagnosis and fix. I'll give it a spin when I get a minute, but thanks for digging in and finding the real issue.

Did you poke around to see if we may be making the same mistake in any other calculations?

@RyanGibb
Copy link
Contributor Author

RyanGibb commented Jul 15, 2024

In my incredulity over not having noticed the pound sign being a multi-byte character it has come to my attention that in my own personal finance records (where I handle a lot of currencies but only a trivial about of travel related things in Pounds) I've been using (U+20A4 LIRA SIGN) instead of £ (U+00A3 POUND SIGN) in my ledger transactions. Yikes. I'll need to re-train my fingers for Compose-L instead of Compose=L.

Also an even further side note, apparently historically pound sterling (and ever other numbers after decimalization) were supposed to use · (U+00B7 MIDDLE DOT) as their decimal separator with accommodation for . (U+0023 FULL STOP) when the former was not supported. Even UK keyboards not having an obvious way to enter interpuncts has lead the full stop to be pretty much de facto, but I'm now disappointed to find hledger does not support interpunct as a possible decimal separator at all! I may have to go fix that.

Sounds like a minefield :-)

In the mean time this sounds like a much better diagnosis and fix. I'll give it a spin when I get a minute, but thanks for digging in and finding the real issue.

That's great! You're most welcome.

Did you poke around to see if we may be making the same mistake in any other calculations?

I couldn't see anything obvious, the other use of matchend uses strchars to support multi-byte strings as well.

@alerque alerque merged commit dbc683e into ledger:master Jul 15, 2024
3 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants