Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add String.{Set,Map} ? #2302

Closed
wants to merge 1 commit into from
Closed

Add String.{Set,Map} ? #2302

wants to merge 1 commit into from

Conversation

nojb
Copy link
Contributor

@nojb nojb commented Mar 7, 2019

Opinions? One downside is that it breaks the idiom

module String = struct
  include String
  module Set = Set.Make (String)
end

which may be quite widespread.

@mshinwell
Copy link
Contributor

Hmm. I thought the example you gave worked now, actually.

I would like to see Set, Map and Tbl, so in favour.

@trefis
Copy link
Contributor

trefis commented Mar 7, 2019

That seems like an anti-pattern: you're adding dependencies between modules in the stdlib, which we probably want to avoid in general, forcing (IIUC) more things to get linked whenever someone uses String.

We had been using this pattern extensively in core for a long time, and have now been moving away from it in our new libraries (e.g. Base).
I'm no expert on the matter, so I'll let @diml comment, but, offhand, it feels like a bad move.

@mshinwell
Copy link
Contributor

I'm not sure on the scale of the stdlib that there is any real problem with dependencies here.

The alternative is presumably to adopt the Base-style approach within Set and Map, right? That's presumably rather more work, although I'm unsure how much exactly.

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 7, 2019

FWIW that's something I end up doing consistently in any non-library project, but I agree with @trefis it may not be a great idea in certain settings.

But then if you have #608 that may not matter, I'd be curious if js_of_ocaml would actually strip them if unused.

@trefis
Copy link
Contributor

trefis commented Mar 7, 2019

I'm not sure on the scale of the stdlib that there is any real problem with dependencies here.

Well, I suspect that with that way of thinking it might actually become a problem at some point, at which point you've already lost.

The alternative is presumably ...

... to do nothing?

But then if you have #608 that may not matter

That's a pretty big if.

@mshinwell
Copy link
Contributor

If there are people willing to work on this, doing nothing isn't really a good option. There is clear value in having the standard library provide basic data structures over basic types, without having to clutter up user code instantiating them, and with consistent easily-discoverable naming provided. These structures should fit within a framework such that in a typical program the majority---if not all---types can be equipped with them and satisfy standard interfaces. (The same goes for printing functions.)

It really is very convenient when you can just say X.Set.add or X.print or whatever (using the appropriate notation), for all of your "X"s: and it leads to better code. Particularly for sets and maps, I think they should often be used more, but tend not to be---at least without libraries such as Base---since they are not immediately at hand.

As far as the implementation goes, @sliquister is probably the expert on this, but from what I remember we didn't really have any problem with the submodule approach until our tree scaled up to be at a very large scale. I don't think it should be dismissed outright.

@bluddy
Copy link
Contributor

bluddy commented Mar 7, 2019

If we're concerned about creating inter-dependencies, perhaps another module e.g. Data could be used, that stores the relation between the types: Data.String.Map, Data.Int.Set etc.

@gasche
Copy link
Member

gasche commented Mar 7, 2019

(Data.String.Map sounds like a sensible idea to me, although I don't know about the new module name.)

@dbuenzli
Copy link
Contributor

dbuenzli commented Mar 7, 2019

(Data.String.Map sounds like a sensible idea to me, although I don't know about the new module name.)

I don't like this idea. I often find that beyond two module levels code readability suffers. Both because of the resulting length of identifiers which hinders writing compact code (by which I mean sequences of let one-liners) and from the visual noise introduced by too many dotted levels.

And of course you could not open such a module as it would override String... that is unless you call these module say String_{set,map} which is a bit meh or if you include String in those but frankly at that point I rather inverse the idea and have a Bare.{String,...} to open and the full glory in the base String.

Somehow the String module is the natural place where you'd like to have them.

@nojb
Copy link
Contributor Author

nojb commented Mar 8, 2019

If we want to avoid linking too much, how about defining a module CamlinternalString with

module Map = Map.Make (struct type t = string let compare = compare end)
module Set = Set.Make (struct type t = string let compare = compare end)

and then in string.ml,

module Set = CamlinternalString.Set
module Map = CamlinternalString.Map

Then, if I understand module aliases correctly, linking with String will not pull in Set and Map unless actually used.

@xclerc
Copy link
Contributor

xclerc commented Mar 8, 2019

@dbuenzli Wouldn't this hypothetical Data module used like
MoreLabels is?

@nojb nojb mentioned this pull request Mar 16, 2019
@nojb
Copy link
Contributor Author

nojb commented May 6, 2019

There isn't clear agreement on this PR, so closing.

@nojb nojb closed this May 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants