This has been called an “fc-multicategory” by Tom Leinster, for example here.

I think this has also been called a “hypervirtual double category” here, but I don’t remember whether this is exactly the same notion or whether the second link makes some additional assumptions.

Another established structure very close to what you want is **opetopic bicategories**, which are equivalent to classical bicategories but formulated opetopically; see e.g. §3 of Cheng 2003, *Opetopic bicategories: comparison with the classical theory*, and the subsection *Non-algebraic notions of bicategory* at the end of §3.4 of Leinster 2003, *Higher operads, higher categories*.

Concretely, in an opetopic bicategory $B$, you have a graph of 0-cells and 1-cells like in a normal bicategory; the source of a 2-cell is not just a 1-cell but a composable string of 1-cells; 2-cell composition looks just like what you’d expect; and the 1-cell composition condition says that for every composable sequence of 1-cells, there’s a *universal* 2-cell out of them, for a certain sense of universality.
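To make the shape of the data concrete, here is a minimal sketch of the description above, keeping everything except the universality condition. The names `Cell1`, `Cell2`, `compose` and `identity` are hypothetical, chosen only for illustration; composites are purely formal (only their boundaries are recorded), and strings of 1-cells are written in diagrammatic order.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Cell1:
    """A 1-cell between two 0-cells (0-cells are just labels here)."""
    name: str
    src: str
    tgt: str


@dataclass(frozen=True)
class Cell2:
    """A 2-cell from a composable string of 1-cells to a single 1-cell."""
    name: str
    source: Tuple[Cell1, ...]  # composable string, possibly empty
    target: Cell1

    def __post_init__(self):
        # Consecutive 1-cells in the source must be composable.
        for f, g in zip(self.source, self.source[1:]):
            if f.tgt != g.src:
                raise ValueError(f"{f.name} and {g.name} are not composable")
        # The source string must be parallel to the target 1-cell.
        # An empty string sits at one 0-cell, so the target must be an endo-1-cell.
        if self.source:
            ends = (self.source[0].src, self.source[-1].tgt)
        else:
            ends = (self.target.src, self.target.src)
        if ends != (self.target.src, self.target.tgt):
            raise ValueError("source string is not parallel to target")


def identity(f: Cell1) -> Cell2:
    """The identity 2-cell on a single 1-cell."""
    return Cell2(f"id_{f.name}", (f,), f)


def compose(theta: Cell2, phis: Sequence[Cell2]) -> Cell2:
    """Formal multicategory-style composite: plug each phi into one input of theta."""
    if tuple(p.target for p in phis) != theta.source:
        raise ValueError("targets of the inner 2-cells must match the source of theta")
    source = tuple(f for p in phis for f in p.source)
    name = f"{theta.name}.({', '.join(p.name for p in phis)})"
    return Cell2(name, source, theta.target)


# Example: f: x -> y, g: y -> z, h, k: x -> z, alpha: (f, g) => h, beta: (h) => k.
f, g = Cell1("f", "x", "y"), Cell1("g", "y", "z")
h, k = Cell1("h", "x", "z"), Cell1("k", "x", "z")
alpha = Cell2("alpha", (f, g), h)
beta = Cell2("beta", (h,), k)
pasted = compose(beta, [alpha])  # a 2-cell (f, g) => k
```

Nothing here asks for a 1-cell composite of `f` and `g`: `h` is just some 1-cell that happens to receive a 2-cell from the string `(f, g)`.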

Keeping all of this except the last condition (call such a thing an **opetopic bicategory minus 1-cell composition**) seems to give exactly what you’re asking for. In the one-object case, the 1-cell composition condition is exactly what Leinster calls *representability* of a multicategory (Def 3.3.1, ibid.), so adding this back recovers the equivalence between monoidal categories and one-object bicategories:

(monoidal category) = (multicategory with representability) = (one-object opetopic bicategory minus 1-cell composition, with 1-cell composition) = (one-object opetopic bicategory) = (one-object bicategory)
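For reference, here is one standard way to spell out the universality involved, in the one-object (multicategory) case. This is a sketch from memory of the usual formulation of representability; the exact wording in Leinster or Cheng may differ. An arrow $u\colon a_1,\dots,a_n \to a$ in a multicategory $C$ is *universal* if, for all objects $b_1,\dots,b_p$, $c_1,\dots,c_q$ and $d$, precomposition with $u$ is a bijection:

$$
\begin{aligned}
C(b_1,\dots,b_p,\,a,\,c_1,\dots,c_q;\,d) &\;\xrightarrow{\ \cong\ }\; C(b_1,\dots,b_p,\,a_1,\dots,a_n,\,c_1,\dots,c_q;\,d),\\
\theta &\;\longmapsto\; \theta\circ(1_{b_1},\dots,1_{b_p},\,u,\,1_{c_1},\dots,1_{c_q}).
\end{aligned}
$$

In this formulation, $C$ is representable when every finite sequence of objects (including the empty one) is the source of a universal arrow; the target $a$ then plays the role of the tensor product $a_1\otimes\cdots\otimes a_n$.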

Comparing this with Simon Henry’s answer, I would expect (opetopic bicategories minus 1-cell composition) to be fairly concretely equivalent to (fc-multicategories with only identity vertical 1-cells); indeed, Leinster hints at such a connection in the subsection mentioned above, though he doesn’t spell it out precisely.